Closed Bug 1828108 Opened 2 years ago Closed 2 years ago

Intermittent Wd | <random> - assert 0 == 1 (assert len(events) == 1)

Categories

(Remote Protocol :: WebDriver BiDi, defect, P5)

All
Android
defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: intermittent-bug-filer, Unassigned)

References

Details

(Keywords: intermittent-failure)

Filed by: smolnar [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=412443409&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/HcsY_fUkTy-Yxr_Bcchmlw/runs/0/artifacts/public/logs/live_backing.log


INFO - TEST-PASS | /webdriver/tests/bidi/session/subscribe/contexts.py | test_subscribe_to_one_context_twice 
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO - TEST-UNEXPECTED-FAIL | /webdriver/tests/bidi/session/subscribe/contexts.py | test_subscribe_to_one_context_and_then_to_all - assert 0 == 1
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO - bidi_session = <webdriver.bidi.client.BidiSession object at 0x7f99fccf2e90>
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO - subscribe_events = <function subscribe_events.<locals>.subscribe_events at 0x7f99fcd74b00>
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO - top_context = {'children': [], 'context': '8a95d0c9-8634-4891-b483-336f0b11ae83', 'parent': None, 'url': 'about:blank'}
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO - new_tab = {'context': '52c352f4-ac05-4d6f-bdab-da009e64d497'}
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO - wait_for_event = <function wait_for_event.<locals>.wait_for_event at 0x7f99fcd74830>
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO - 
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -     @pytest.mark.asyncio
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -     async def test_subscribe_to_one_context_and_then_to_all(
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -         bidi_session, subscribe_events, top_context, new_tab, wait_for_event
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -     ):
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -         # Subscribe for log events to a specific context
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -         await subscribe_events(events=["log.entryAdded"], contexts=[top_context["context"]])
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -     
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -         # Track all received log.entryAdded events in the events array
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -         events = []
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -     
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -         async def on_event(method, data):
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -             events.append(data)
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -     
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -         remove_listener = bidi_session.add_event_listener("log.entryAdded", on_event)
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -     
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -         # Trigger console event in the another context
[task 2023-04-14T12:06:53.694Z] 12:06:53     INFO -         buffered_event_expected_text = await create_console_api_message(
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -             bidi_session, new_tab, "text1"
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -         )
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -     
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -         assert len(events) == 0
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -     
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -         # Trigger another console event in the observed context
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -         on_entry_added = wait_for_event("log.entryAdded")
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -         expected_text = await create_console_api_message(bidi_session, top_context, "text2")
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -         await on_entry_added
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -     
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -         assert len(events) == 1
INFO -         recursive_compare(
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -             {
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -                 "text": expected_text,
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -             },
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -             events[0],
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -         )
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -     
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -         events = []
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -     
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -         # Subscribe to all contexts
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -         await subscribe_events(events=["log.entryAdded"])
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -     
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO -         # Check that we received the buffered event
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO - >       assert len(events) == 1
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO - E       assert 0 == 1
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO - E         +0
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO - E         -1
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO - 
[task 2023-04-14T12:06:53.695Z] 12:06:53     INFO - bidi_session = <webdriver.bidi.client.BidiSession object at 0x7f99fccf2e90>
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO - buffered_event_expected_text = 'text1'
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO - events     = []
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO - expected_text = 'text2'
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO - new_tab    = {'context': '52c352f4-ac05-4d6f-bdab-da009e64d497'}
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO - on_entry_added = <Future finished result={'args': [{'type': 'string', 'value': 'text2'}], 'level': 'info', 'method': 'log', 'source': {'context': '8a95d0c9-863...-336f0b11ae83', 'realm': '2147483649'}, ...}>
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO - on_event   = <function test_subscribe_to_one_context_and_then_to_all.<locals>.on_event at 0x7f99fcd748c0>
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO - remove_listener = <function BidiSession.add_event_listener.<locals>.<lambda> at 0x7f99fcd76170>
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO - subscribe_events = <function subscribe_events.<locals>.subscribe_events at 0x7f99fcd74b00>
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO - top_context = {'children': [],
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO -  'context': '8a95d0c9-8634-4891-b483-336f0b11ae83',
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO -  'parent': None,
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO -  'url': 'about:blank'}
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO - wait_for_event = <function wait_for_event.<locals>.wait_for_event at 0x7f99fcd74830>
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO - 
[task 2023-04-14T12:06:53.696Z] 12:06:53     INFO - tests/web-platform/tests/webdriver/tests/bidi/session/subscribe/contexts.py:122: AssertionError
[task 2023-04-14T12:06:53.697Z] 12:06:53     INFO - ...
[task 2023-04-14T12:06:53.698Z] 12:06:53     INFO - TEST-OK | /webdriver/tests/bidi/session/subscribe/contexts.py | took 7048ms

Failed on Android but due to missing trace output it's not possible to actually investigate.

OS: Unspecified → Android
Hardware: Unspecified → All
Summary: Intermittent /webdriver/tests/bidi/session/subscribe/contexts.py | test_subscribe_to_one_context_and_then_to_all - assert 0 == 1 → Intermittent Wd | <random> - assert 0 == 1 (assert len(events) == 1)
Duplicate of this bug: 1828687

As the android log cat from the recently duped bug shows we seem to miss cached console API entries:

04-18 11:18:36.312 22646 22661 I Gecko   : 1681816716312	RemoteAgent	DEBUG	WebDriverBiDiConnection 5a4c248c-3b15-47a7-b29f-ab3f2a6b157a -> {"id":4,"method":"script.callFunction","params":{"functionDeclaration":"(text) => console.log(text)","target":{"context":"b951b3e3-0700-46b7-b039-205e06ba2f8c"},"awaitPromise":false,"arguments":[{"type":"string","value":"cached_message"}]}}
04-18 11:18:36.314 22646 22661 I Gecko   : 1681816716314	RemoteAgent	TRACE	Received command script.callFunction for destination ROOT
04-18 11:18:36.314 22646 22661 I Gecko   : 1681816716314	RemoteAgent	TRACE	Module root/script.jsm found for ROOT
04-18 11:18:36.316 22792 22807 D GeckoViewContentDelegateChild[C]: handleEvent: DOMContentLoaded
04-18 11:18:36.317 22646 22661 D GeckoViewContent: handleEvent: pagetitlechanged
04-18 11:18:36.322 22792 22807 D GeckoViewContent[C]: handleEvent: pageshow
04-18 11:18:36.323 22792 22807 D GeckoViewAutoFill[C]: handleEvent: pageshow
04-18 11:18:36.325 22792 22807 I Gecko   : 1681816716325	RemoteAgent	TRACE	MessageHandler WINDOW_GLOBAL for session 9fbd684a-ef93-4562-82d4-18e3ae1058c4 is being destroyed
04-18 11:18:36.326 22792 22807 I Gecko   : 1681816716326	RemoteAgent	TRACE	Unregistered MessageHandler WINDOW_GLOBAL for session 9fbd684a-ef93-4562-82d4-18e3ae1058c4
04-18 11:18:36.332 22792 22807 I Gecko   : 1681816716332	RemoteAgent	TRACE	Received command script.callFunctionDeclaration for destination WINDOW_GLOBAL
04-18 11:18:36.335 22792 22807 I Gecko   : 1681816716334	RemoteAgent	TRACE	Module windowglobal/script.jsm found for WINDOW_GLOBAL
04-18 11:18:36.336 22792 22807 I Gecko   : 1681816716336	RemoteAgent	TRACE	WebDriverProcessData actor created for PID 22792
04-18 11:18:36.344 22792 22807 I Web Content: cached_message
04-18 11:18:36.346 22712 22748 E eglCodecCommon: glUtilsParamSize: unknow param 0x00008caa
04-18 11:18:36.348 22646 22661 I Gecko   : 1681816716348	RemoteAgent	DEBUG	WebDriverBiDiConnection 5a4c248c-3b15-47a7-b29f-ab3f2a6b157a <- {"id":4,"result":{"realm":"57da1e66-d7b8-46db-a0c8-04f776303d40","type":"success","result":{"type":"undefined"}}}
04-18 11:18:36.356 22646 22661 I Gecko   : 1681816716356	RemoteAgent	DEBUG	WebDriverBiDiConnection 5a4c248c-3b15-47a7-b29f-ab3f2a6b157a -> {"id":5,"method":"session.subscribe","params":{"events":["log.entryAdded"]}}
04-18 11:18:36.356 22646 22661 I Gecko   : 1681816716356	RemoteAgent	TRACE	Received command session.subscribe for destination ROOT
04-18 11:18:36.360 22646 22661 I Gecko   : 1681816716360	RemoteAgent	TRACE	Received command log._applySessionData for destination WINDOW_GLOBAL
04-18 11:18:36.362 22792 22807 I Gecko   : 1681816716362	RemoteAgent	TRACE	Received command log._applySessionData for destination WINDOW_GLOBAL
04-18 11:18:36.362 22676 22694 I Gecko   : 1681816716362	RemoteAgent	TRACE	Received command log._applySessionData for destination WINDOW_GLOBAL
04-18 11:18:36.367 22646 22661 I Gecko   : 1681816716367	RemoteAgent	DEBUG	WebDriverBiDiConnection 5a4c248c-3b15-47a7-b29f-ab3f2a6b157a <- {"id":5,"result":{}}
04-18 11:18:36.369 22646 22661 D GeckoViewContentDelegateParent: receiveMessage: DispatcherMessage
04-18 11:18:36.377 22646 22661 I Gecko   : 1681816716377	RemoteAgent	DEBUG	WebDriverBiDiConnection 5a4c248c-3b15-47a7-b29f-ab3f2a6b157a -> {"id":6,"method":"browsingContext.close","params":{"context":"b951b3e3-0700-46b7-b039-205e06ba2f8c"}}

Maybe we need a small delay at the end of the test to allow the browser to actually send the events on the device and to let them receive on the host?

https://treeherder.mozilla.org/logviewer?job_id=412831994&repo=mozilla-central&lineNumber=21347-21352

Julian, what do you think?

Flags: needinfo?(jdescottes)

I think this might be again a navigation issue, where we don't wait long enough for the initial about:blank. See the following part of the logs:

04-18 11:18:36.312 22646 22661 I Gecko : 1681816716312 RemoteAgent DEBUG WebDriverBiDiConnection 5a4c248c-3b15-47a7-b29f-ab3f2a6b157a -> {"id":4,"method":"script.callFunction","params":{"functionDeclaration":"(text) => console.log(text)","target":{"context":"b951b3e3-0700-46b7-b039-205e06ba2f8c"},"awaitPromise":false,"arguments":[{"type":"string","value":"cached_message"}]}}
04-18 11:18:36.314 22646 22661 I Gecko : 1681816716314 RemoteAgent TRACE Received command script.callFunction for destination ROOT
04-18 11:18:36.314 22646 22661 I Gecko : 1681816716314 RemoteAgent TRACE Module root/script.jsm found for ROOT
04-18 11:18:36.316 22792 22807 D GeckoViewContentDelegateChild[C]: handleEvent: DOMContentLoaded
04-18 11:18:36.317 22646 22661 D GeckoViewContent: handleEvent: pagetitlechanged
04-18 11:18:36.322 22792 22807 D GeckoViewContent[C]: handleEvent: pageshow
04-18 11:18:36.323 22792 22807 D GeckoViewAutoFill[C]: handleEvent: pageshow
04-18 11:18:36.325 22792 22807 I Gecko : 1681816716325 RemoteAgent TRACE MessageHandler WINDOW_GLOBAL for session 9fbd684a-ef93-4562-82d4-18e3ae1058c4 is being destroyed
04-18 11:18:36.326 22792 22807 I Gecko : 1681816716326 RemoteAgent TRACE Unregistered MessageHandler WINDOW_GLOBAL for session 9fbd684a-ef93-4562-82d4-18e3ae1058c4
04-18 11:18:36.332 22792 22807 I Gecko : 1681816716332 RemoteAgent TRACE Received command script.callFunctionDeclaration for destination WINDOW_GLOBAL
04-18 11:18:36.335 22792 22807 I Gecko : 1681816716334 RemoteAgent TRACE Module windowglobal/script.jsm found for WINDOW_GLOBAL
04-18 11:18:36.336 22792 22807 I Gecko : 1681816716336 RemoteAgent TRACE WebDriverProcessData actor created for PID 22792
04-18 11:18:36.344 22792 22807 I Web Content: cached_message
04-18 11:18:36.346 22712 22748 E eglCodecCommon: glUtilsParamSize: unknow param 0x00008caa
04-18 11:18:36.348 22646 22661 I Gecko : 1681816716348 RemoteAgent DEBUG WebDriverBiDiConnection 5a4c248c-3b15-47a7-b29f-ab3f2a6b157a <- {"id":4,"result":{"realm":"57da1e66-d7b8-46db-a0c8-04f776303d40","type":"success","result":{"type":"undefined"}}}
04-18 11:18:36.356 22646 22661 I Gecko : 1681816716356 RemoteAgent DEBUG WebDriverBiDiConnection 5a4c248c-3b15-47a7-b29f-ab3f2a6b157a -> {"id":5,"method":"session.subscribe","params":{"events":["log.entryAdded"]}}

Summary of what happens:

  • trigger the log message using callFunction
  • receive DOMContentLoaded
  • MessageHandler is being destroyed
  • try to subscribe to log.entryAdded

Sadly we don't log the creation of MessageHandlers, but I suspect that we create the log message in the temporary about:blank. And when we call subscribe this actually ends up in a new process, in a new MessageHandler etc...

To also support this theory the beginning of the test starts with :

04-18 11:18:35.997 22646 22661 I Gecko : 1681816715997 RemoteAgent TRACE [4] ProgressListener Start: expectNavigation=false resolveWhenStarted=false unloadTimeout=200 waitForExplicitStart=false
04-18 11:18:35.997 22646 22661 I Gecko : 1681816715997 RemoteAgent TRACE [4] ProgressListener Setting unload timer (200ms)
[... Removed a lot of logs of internal stuff which happened in between ... might explain why the 200ms timeout is not enough]
04-18 11:18:36.199 22646 22661 I Gecko : 1681816716199 RemoteAgent TRACE [4] ProgressListener No navigation detected: about:blank
04-18 11:18:36.199 22646 22661 I Gecko : 1681816716199 RemoteAgent TRACE [4] ProgressListener Stop: has error=false

You can see we bail from the initial navigation because of the 200ms timeout.

A successful run would typically use the following sequence:

0:08.42 pid:20290 1682325998650 RemoteAgent TRACE [12] ProgressListener Start: expectNavigation=false resolveWhenStarted=false unloadTimeout=200 waitForExplicitStart=false
0:08.43 pid:20290 1682325998650 RemoteAgent TRACE [12] ProgressListener Setting unload timer (200ms)
0:08.47 pid:20290 1682325998698 RemoteAgent TRACE [12] ProgressListener Check loading state: isStart=1 isStop=0
0:08.47 pid:20290 1682325998698 RemoteAgent TRACE [12] ProgressListener state=start: about:blank
0:08.47 pid:20290 1682325998698 RemoteAgent TRACE [12] ProgressListener Cleared the unload timer
0:08.47 pid:20290 1682325998699 RemoteAgent TRACE [12] ProgressListener Check loading state: isStart=0 isStop=16
0:08.47 pid:20290 1682325998699 RemoteAgent TRACE [12] ProgressListener Ignore aborted navigation error to the initial document, real document will be loaded.
0:08.47 pid:20290 1682325998699 RemoteAgent TRACE [12] ProgressListener Check loading state: isStart=1 isStop=0
0:08.48 pid:20290 1682325998703 RemoteAgent TRACE [12] ProgressListener Check loading state: isStart=0 isStop=16
0:08.48 pid:20290 1682325998703 RemoteAgent TRACE [12] ProgressListener state=stop: about:blank
0:08.48 pid:20290 1682325998703 RemoteAgent TRACE [12] ProgressListener Stop: has error=false

I think we have several intermittents where I repeatedly have the same conclusion let me try to regroup them

Flags: needinfo?(jdescottes)

(In reply to Julian Descottes [:jdescottes] from comment #6)

You can see we bail from the initial navigation because of the 200ms timeout.

Good point! Probably we have to use a separate multiplier for Android builds as well similar to debug, ccov etc:
https://searchfox.org/mozilla-central/rev/26790fecfcda622dab234b28859da721b80f3a35/remote/shared/Navigate.sys.mjs#24-40

Setting needinfo for Julian given that he wants to try to re-group all the related bugs.

Flags: needinfo?(jdescottes)

(thanks for the needinfo, I didn't find any other bug so far :) maybe we closed them all because they were quite low frequency?)

In this case lets keep the bug open for a little bit longer and check back by next week. We could close by then if no other failures came up.

Flags: needinfo?(jdescottes)
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INCOMPLETE

https://treeherder.mozilla.org/logviewer?job_id=416343315&repo=try

Thsi try push has basically three failures all in /webdriver/tests/bidi/log/entry_added/event_buffer.py | test_console_log_cached_messages[console_api_log] .

Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---

And given https://hg.mozilla.org/integration/autoland/rev/1872a0c7f171 we now accept PASS / FAIL on Android, so we won't see the failure again.

I wonder if we should undo that change and maybe others from a previous wpt downstream sync to have a base for try pushes when trying to investigate this failure.

No longer depends on: 1832891
No longer depends on: 1838382

This failure is potentially fixed by bug 1838382. Given that I'm unclear I'm going to mark the bug as WFM.

Status: REOPENED → RESOLVED
Closed: 2 years ago2 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.