Closed Bug 1572086 Opened 5 years ago Closed 4 years ago

Perma Mn-fis testing/marionette/harness/marionette_harness/tests/unit/test_navigation.py TestNavigate.test_invalid_url | IOError: Process killed because the connection to Marionette server is lost

Categories

(Testing :: Marionette Client and Harness, defect, P2)

Version 3
defect

Tracking

(Fission Milestone:M6)

RESOLVED WORKSFORME
Fission Milestone M6

People

(Reporter: intermittent-bug-filer, Unassigned)

References

Details

(Keywords: intermittent-failure, regression)

Attachments

(1 obsolete file)

As it looks like also when looking at the logs, loading an URI with an invalid protocol causes Firefox to enter an infinite loop of destroying and attaching the frame listener. It never ends loading the page.

1565185202443   Marionette  DEBUG   3 -> [0,16,"WebDriver:Navigate",{"url":"thisprotocoldoesnotexist://"}]
1565185202455   Marionette  TRACE   [6442450945] Received DOM event beforeunload for about:blank
1565185202573   Marionette  TRACE   [6442450945] Received DOM event pagehide for about:blank
1565185202574   Marionette  TRACE   [6442450945] Received DOM event unload for about:blank
1565185202575   Marionette  TRACE   [6442450945] Received observer notification outer-window-destroyed
[64883, MainThread] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file /builds/worker/workspace/build/src/xpcom/base/nsTraceRefcnt.cpp, line 198
[64883, MainThread] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file /builds/worker/workspace/build/src/xpcom/base/nsTraceRefcnt.cpp, line 198
### XPCOM_MEM_LEAK_LOG defined -- logging leaks to stderr
[Child 64879, Main Thread] WARNING: nsAppShell::Exit() called redundantly: file /builds/worker/workspace/build/src/widget/cocoa/nsAppShell.mm, line 725

###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost

[Child 64877, Main Thread] WARNING: No active window: file /builds/worker/workspace/build/src/js/xpconnect/src/XPCJSContext.cpp, line 664
--DOCSHELL 0x11a70e000 == 0 [pid = 64879] [id = {8ed2430d-d3f9-0e41-9a6a-c22c4a621113}] [url = about:blank]
--DOMWINDOW == 2 (0x11a776020) [pid = 64879] [serial = 1] [outer = 0x0] [url = about:blank]
--DOMWINDOW == 1 (0x11a7e1800) [pid = 64879] [serial = 3] [outer = 0x0] [url = about:blank]
--DOMWINDOW == 0 (0x11a74e000) [pid = 64879] [serial = 2] [outer = 0x0] [url = about:blank]

== BloatView: ALL (cumulative) LEAK STATISTICS, tab process 64879
nsTraceRefcnt::DumpStatistics: 770 entries
nsStringStats
 => mAllocCount:           7049
 => mReallocCount:            0
 => mFreeCount:            7049
 => mShareCount:           5962
 => mAdoptCount:            409
 => mAdoptFreeCount:        409
 => Process ID: 64879, Thread ID: 4618216896
++DOCSHELL 0x112d17000 == 1 [pid = 64883] [id = {622f7983-7c6a-5e45-8de1-e2751494be45}]
++DOMWINDOW == 1 (0x112d70020) [pid = 64883] [serial = 1] [outer = 0x0]
++DOMWINDOW == 2 (0x112d52c00) [pid = 64883] [serial = 2] [outer = 0x112d70020]
1565185202935   Marionette  TRACE   [10737418241] Frame script loaded
1565185202945   Marionette  TRACE   [10737418241] Frame script registered
1565185202973   Marionette  TRACE   [10737418241] Check readyState uninitialized for about:blank
[64884, MainThread] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file /builds/worker/workspace/build/src/xpcom/base/nsTraceRefcnt.cpp, line 198
[64884, MainThread] WARNING: XPCOM objects created/destroyed from static ctor/dtor: file /builds/worker/workspace/build/src/xpcom/base/nsTraceRefcnt.cpp, line 198
### XPCOM_MEM_LEAK_LOG defined -- logging leaks to stderr
[Child 64883, Main Thread] WARNING: nsAppShell::Exit() called redundantly: file /builds/worker/workspace/build/src/widget/cocoa/nsAppShell.mm, line 725

###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost

--DOCSHELL 0x112d17000 == 0 [pid = 64883] [id = {622f7983-7c6a-5e45-8de1-e2751494be45}] [url = about:blank]
--DOMWINDOW == 1 (0x112d70020) [pid = 64883] [serial = 1] [outer = 0x0] [url = about:blank]
--DOMWINDOW == 0 (0x112d52c00) [pid = 64883] [serial = 2] [outer = 0x0] [url = about:blank]

== BloatView: ALL (cumulative) LEAK STATISTICS, tab process 64883
nsTraceRefcnt::DumpStatistics: 646 entries
nsStringStats
 => mAllocCount:           6437
 => mReallocCount:            0
 => mFreeCount:            6437
 => mShareCount:           5524
 => mAdoptCount:            322
 => mAdoptFreeCount:        322
 => Process ID: 64883, Thread ID: 4613346752
++DOCSHELL 0x11ea17800 == 1 [pid = 64884] [id = {54b8a251-c699-ae41-bb81-10fe0c0b71e1}]
++DOMWINDOW == 1 (0x11ea71020) [pid = 64884] [serial = 1] [outer = 0x0]
++DOMWINDOW == 2 (0x11ea49400) [pid = 64884] [serial = 2] [outer = 0x11ea71020]
[Parent 64874, Classifier Update] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/worker/workspace/build/src/toolkit/components/url-classifier/LookupCache.cpp, line 882
[..]
1565185203361   Marionette  TRACE   [12884901889] Frame script loaded
1565185203362   Marionette  TRACE   [12884901889] Frame script registered
1565185203387   Marionette  TRACE   [12884901889] Check readyState uninitialized for about:blank
[..]

How to run the test:

  1. If wanted edit test_navigation.py, and disable all tests beside test_invalid_url
  2. Run mach marionette-test testing/marionette/harness/marionette_harness/tests/unit/test_navigation.py -vv --gecko-log - --setpref fission.autostart=true

Nika or Mike, could one of you please have a look? It's not clear to me if that is a problem with framescripts (and Marionette) or a general issue in Firefox, which needs to be fixed.

Flags: needinfo?(nika)
Flags: needinfo?(mconley)
Summary: Intermittent Mn-fis testing/marionette/harness/marionette_harness/tests/unit/test_navigation.py TestNavigate.test_invalid_url | IOError: Process killed because the connection to Marionette server is lost → Perma Mn-fis testing/marionette/harness/marionette_harness/tests/unit/test_navigation.py TestNavigate.test_invalid_url | IOError: Process killed because the connection to Marionette server is lost

This is interesting. I'm guessing this is caused by the code in E10SUtils.validatedWebRemoteType: https://searchfox.org/mozilla-central/rev/9775cca0a10a9b5c5f4e15c8f7b3eff5bf91bbd0/toolkit/modules/E10SUtils.jsm#125

We probably fail to create a principal in the corresponding line, and end up with a null principal. As every process where we do this generates a new null principal, we loop forever trying to load into a new content process. Eventually we'll want to have processes for null principals, but we'll actually need to track the null principal through the process in that case.

Flags: needinfo?(nika)
Flags: needinfo?(mconley)
Assignee: nobody → nika
Fission Milestone: --- → M4
Priority: P5 → P2

This appears to have produced some new test failures with M-fis which I haven't had the time to look into: https://treeherder.mozilla.org/#/jobs?repo=try&revision=282703c0d23cfcf29aa2a29a137a31e4d31504a6 (ignore the WPT and Mn failures - they were accidentally queued up)

Flags: needinfo?(afarre)

There's a r+ patch which didn't land and no activity in this bug for 2 weeks.
:Nika, could you have a look please?
For more information, please visit auto_nag documentation.

Flags: needinfo?(nika)
Flags: needinfo?(nika)
Flags: needinfo?(afarre)

Re-requesting need-info in case the message was overseen. Nika, do you have an idea when you would have the time to continue on this bug? Thanks.

Flags: needinfo?(nika)

I'm not sure if this patch is fully correct anymore. A better approach might need to be taken to fixing this issue.

Given that we aren't running marionette on central right now, I'm inclined to abandon this patch for a bit, and try to fix the issues as part of a larger-scale refactoring of how we perform process selection with fission enabled.

Assignee: nika → nobody
Flags: needinfo?(nika)
Attachment #9084095 - Attachment is obsolete: true

Roll unfixed test bugs from Fission Milestone M4 to M4.1

Fission Milestone: M4 → M4.1

Tracking for Fission milestone M6 because we want to be able to run Marionette tests with Fission before enabling in Nightly.

Fission Milestone: M4.1 → M6

I cannot see this test failing anymore with recent Nightly builds. As such closing as WFM.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME
Product: Testing → Remote Protocol
Moving bug to Testing::Marionette Client and Harness component per bug 1815831.
Component: Marionette → Marionette Client and Harness
Product: Remote Protocol → Testing
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: