1303346 - [e10s] Intermittent test_toolbars.py TestLocationBar.test_load_url | IOError: Process has been closed (Exit code: -15) (Reason: [Errno 111] Connection refused)

Reporter

Description

•

8 years ago

treeherder

Filed by: hskupin [at] gmail.com

https://treeherder.mozilla.org/logviewer.html#?job_id=3618908&repo=autoland

https://queue.taskcluster.net/v1/task/N8pZlwzAQ_qMD8X7B_5YpA/runs/0/artifacts/public%2Flogs%2Flive_backing.log

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 1

•

8 years ago

With the patch on bug 1301093 landed on autoland I see a new test failure popping up quite often now:


https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=85ec67dd11e5626a4cf0efc2abda9e3d44cb63b2&filter-searchStr=fxfn&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=running&filter-resultStatus=pending&filter-resultStatus=runnable

[task 2016-09-16T11:32:42.182647Z] 11:32:42     INFO - TEST-UNEXPECTED-ERROR | test_toolbars.py TestLocationBar.test_load_url | IOError: Process has been closed (Exit code: -15) (Reason: [Errno 111] Connection refused)
[task 2016-09-16T11:32:42.185021Z] 11:32:42     INFO - Traceback (most recent call last):
[task 2016-09-16T11:32:42.185657Z] 11:32:42     INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette/marionette_test/testcases.py", line 162, in run
[task 2016-09-16T11:32:42.186575Z] 11:32:42     INFO -     testMethod()
[task 2016-09-16T11:32:42.188007Z] 11:32:42     INFO -   File "/home/worker/workspace/build/tests/firefox-ui/tests/puppeteer/test_toolbars.py", line 133, in test_load_url
[task 2016-09-16T11:32:42.189470Z] 11:32:42     INFO -     Wait(self.marionette).until(lambda mn: mn.get_url() == data_uri)
[task 2016-09-16T11:32:42.190936Z] 11:32:42     INFO -   File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
[task 2016-09-16T11:32:42.191908Z] 11:32:42     INFO -     self.gen.throw(type, value, traceback)
[task 2016-09-16T11:32:42.193930Z] 11:32:42     INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/marionette.py", line 1359, in using_context
[task 2016-09-16T11:32:42.196095Z] 11:32:42     INFO -     self.set_context(scope)
[task 2016-09-16T11:32:42.197425Z] 11:32:42     INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/marionette.py", line 1337, in set_context
[task 2016-09-16T11:32:42.198901Z] 11:32:42     INFO -     self._send_message("setContext", {"value": context})
[task 2016-09-16T11:32:42.200096Z] 11:32:42     INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/decorators.py", line 55, in _
[task 2016-09-16T11:32:42.201889Z] 11:32:42     INFO -     m.force_shutdown()
[task 2016-09-16T11:32:42.205262Z] 11:32:42     INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/decorators.py", line 42, in _
[task 2016-09-16T11:32:42.206901Z] 11:32:42     INFO -     return func(*args, **kwargs)
[task 2016-09-16T11:32:42.207784Z] 11:32:42     INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/marionette.py", line 682, in _send_message
[task 2016-09-16T11:32:42.209007Z] 11:32:42     INFO -     msg = self.client.request(name, params)
[task 2016-09-16T11:32:42.209919Z] 11:32:42     INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/transport.py", line 274, in request
[task 2016-09-16T11:32:42.211358Z] 11:32:42     INFO -     self.send(cmd)
[task 2016-09-16T11:32:42.212357Z] 11:32:42     INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/transport.py", line 236, in send
[task 2016-09-16T11:32:42.213340Z] 11:32:42     INFO -     self.connect()
[task 2016-09-16T11:32:42.214352Z] 11:32:42     INFO -   File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/transport.py", line 214, in connect
[task 2016-09-16T11:32:42.215307Z] 11:32:42     INFO -     self.sock.connect((self.addr, self.port))
[task 2016-09-16T11:32:42.216360Z] 11:32:42     INFO -   File "/usr/lib/python2.7/socket.py", line 228, in meth
[task 2016-09-16T11:32:42.217297Z] 11:32:42     INFO -     return getattr(self._sock,name)(*args)
[task 2016-09-16T11:32:42.219215Z] 11:32:42

Blocks: 1301093

status-firefox51: --- → affected

Keywords: regression

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Updated

•

8 years ago

Flags: needinfo?(mak77)

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 2

•

8 years ago

It seems to be e10s only.

Summary: Intermittent test_toolbars.py TestLocationBar.test_load_url | IOError: Process has been closed (Exit code: -15) (Reason: [Errno 111] Connection refused) → [e10s] Intermittent test_toolbars.py TestLocationBar.test_load_url | IOError: Process has been closed (Exit code: -15) (Reason: [Errno 111] Connection refused)

Marco Bonardo [:mak]

Comment 3

•

8 years ago

The difference is that before we were supposed to wait for the awesomebar results before handling Enter, but we were not doing that.
Now we should be constantly waiting, so if you type something and press Enter, we will wait for the first result to come before we handle the Enter action.
Most of the tests were expecting a content load, so it didn't require big changes to tests.

Maybe you can tell me more about the test behavior and actions?

Flags: needinfo?(mak77)

Summary: [e10s] Intermittent test_toolbars.py TestLocationBar.test_load_url | IOError: Process has been closed (Exit code: -15) (Reason: [Errno 111] Connection refused) → Intermittent test_toolbars.py TestLocationBar.test_load_url | IOError: Process has been closed (Exit code: -15) (Reason: [Errno 111] Connection refused)

Marco Bonardo [:mak]

Updated

•

8 years ago

Summary: Intermittent test_toolbars.py TestLocationBar.test_load_url | IOError: Process has been closed (Exit code: -15) (Reason: [Errno 111] Connection refused) → [e10s] Intermittent test_toolbars.py TestLocationBar.test_load_url | IOError: Process has been closed (Exit code: -15) (Reason: [Errno 111] Connection refused)

Comment hidden (Intermittent Failures Robot)

24 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 17
* mozilla-central: 6
* fx-team: 1

Platform breakdown:
* linux64: 24

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2016-09-16&endday=2016-09-16&tree=all

Comment hidden (Intermittent Failures Robot)

22 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 15
* mozilla-central: 3
* fx-team: 2
* try: 1
* autoland: 1

Platform breakdown:
* linux64: 22

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2016-09-17&endday=2016-09-17&tree=all

Comment hidden (Intermittent Failures Robot)

51 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 21
* mozilla-inbound: 17
* mozilla-central: 9
* fx-team: 3
* try: 1

Platform breakdown:
* linux64: 51

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2016-09-12&endday=2016-09-18&tree=all

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 7

•

8 years ago

What concerns me here is that Firefox now starts to quit somehow. Not sure what exit code -15 exactly means, but maybe its a sigterm it could be related to a content process crash? It definitely started with your patch being landed and is our top failure now.

The test itself is pretty simple:

https://dxr.mozilla.org/mozilla-central/rev/f0f15b7c6aa77a0c5750918aa0a1cb3dc82185bc/testing/firefox-ui/tests/puppeteer/test_toolbars.py#128
https://dxr.mozilla.org/mozilla-central/rev/f0f15b7c6aa77a0c5750918aa0a1cb3dc82185bc/testing/puppeteer/firefox/firefox_puppeteer/ui/browser/toolbars.py#220

We focus the locationbar, enter a data URI, and press enter immediately.

Tracy Walker [:tracy]

Updated

•

8 years ago

Blocks: e10s-tests

tracking-e10s: --- → +

Comment hidden (Intermittent Failures Robot)

5 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* fx-team: 4
* autoland: 1

Platform breakdown:
* linux64: 5

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2016-09-19&endday=2016-09-25&tree=all

Comment hidden (Intermittent Failures Robot)

5 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 3
* try: 1
* mozilla-inbound: 1

Platform breakdown:
* linux64: 5

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2016-09-26&endday=2016-10-02&tree=all

Ryan VanderMeulen [:RyanVM]

Updated

•

8 years ago

status-firefox50: --- → unaffected

status-firefox52: --- → affected

Comment hidden (Intermittent Failures Robot)

8 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 5
* autoland: 2
* fx-team: 1

Platform breakdown:
* linux64: 8

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2016-10-03&endday=2016-10-09&tree=all

Comment hidden (Intermittent Failures Robot)

8 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 4
* mozilla-inbound: 3
* fx-team: 1

Platform breakdown:
* linux64: 8

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2016-10-10&endday=2016-10-16&tree=all

Gerry Chang [:gchang]

Comment 12

•

8 years ago

Hi ::mak,
Per comment #7, can you help shed some light here?

Flags: needinfo?(mak77)

Marco Bonardo [:mak]

Comment 13

•

8 years ago

(In reply to Henrik Skupin (:whimboo) from comment #7)
> What concerns me here is that Firefox now starts to quit somehow. Not sure
> what exit code -15 exactly means, but maybe its a sigterm it could be
> related to a content process crash? 

Is the "Connection refused" error coming from marionette and meaning it lost the connection to Firefox? I was wondering if the whole thing was caused by an outside connection that the harness disallows, and iirc we bailout abruptly if we try to reach the outside worls... but if it's just marionette indicating it lost the connection to FF, it's unlikely.

I don't have a lot of ideas, as I said the only difference is that before we were basically not opening the awesomebar popup and not waiting for any results, while now we may be waiting for at least one result (maybe more are incoming) and we open the popup briefly. I didn't see any reports in the wild about crashes from the awesomebar so doesn't look like it's a known crash or similar.
May the popup confuse your code? Do we have evidence Firefox crashed, like a desktop screenshot? is there a way to get a stack of that crash?

Flags: needinfo?(mak77)

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 14

•

8 years ago

Sadly this crash only happens for opt, and pgo builds. Very rarely also for ASAN builds. But all those build types do not really give helpful information via the logs. As long as this is not reproducible I have no idea how to track this down.

Exit code -15 at least means SIGTERM, so something killed Firefox. It's clearly not Marionette client because then we do not log the exit code. Reasons could be different why Firefox gets closed.

So far I haven't seen this behavior for our tests run against nightly builds for both central and aurora. Maybe it is related to the Taskcluster docker workers which run Ubuntu 16.04 while our machines are on Ubuntu 14.04. We switched from 12.04 to 16.04 via bug 1290192 on August 3rd. But there is still more than a months until this failure has been started.

Marco Bonardo [:mak]

Comment 15

•

8 years ago

it should be possible to generate optimized build with debug symbols (ac_add_options --enable-optimize ac_add_options --enable-debug-symbols), I wonder if that may help with getting a stack from the harness?

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 16

•

8 years ago

If Firefox gets killed from the OS or another process we do not generate minidumps. Also those would not be that helpful either given that Firefox abruptly stops working at some random code. 

The occurrence of this failure is still so low that it might take a while even when using a one click loaner on Taskcluster.

Comment hidden (Intermittent Failures Robot)

10 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 5
* mozilla-inbound: 4
* mozilla-central: 1

Platform breakdown:
* linux64: 10

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2016-10-17&endday=2016-10-23&tree=all

Comment hidden (Intermittent Failures Robot)

9 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 7
* mozilla-central: 1
* fx-team: 1

Platform breakdown:
* linux64: 9

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2016-10-24&endday=2016-10-30&tree=all

Milan Sreckovic [:milan] (needinfo for best results)

Comment 19

•

8 years ago

I can't tell how critical this issue is to fix in 51, and we're about to go to Beta.  The SIGTERM is a bit worrisome, but do we have an idea if this is somehow test only, and wouldn't show up in "real world"?

Flags: needinfo?(mak77)

OS: Unspecified → Linux

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 20

•

8 years ago

Yesterday I landed the improved crash handling for Marionette via bug 1299216. So what we would need is another occurrence of this abort. If it is a crash we should see details, if not it might be a system triggered abort of Firefox.

Marco Bonardo [:mak]

Comment 21

•

8 years ago

I don't have an answer cause I have absolutely no idea what causes this crash without a stack.
As an evidence, I didn't see crash bugs posted recently regarding the awesomebar and I didn't notice anything suspicious on crash-stats. I think crash-stats is our best bet at this point to tell if something is worrisome.
I'm leaving the ni open, to see if the new crash handling can tell us more.

Peter Chang[:pchang]

Comment 22

•

8 years ago

I also saw 19 total failure from 10/24 to 11/21.

Peter Chang[:pchang]

Comment 23

•

8 years ago

s/also/only

Marco Bonardo [:mak]

Comment 24

•

8 years ago

Henrik, did we get any additional info from the improved crash handling fix?

Flags: needinfo?(mak77) → needinfo?(hskupin)

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 25

•

8 years ago

Hm, the fix landed on Nov 8th, but OrangeFactor doesn't list any failures anymore. The last one is from Oct 30th. Maybe failures aren't getting starred correctly, or something fixed this issue. I will have a quick scan through Treeherder and figure out.

Flags: needinfo?(hskupin)

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 26

•

8 years ago

Actually I missed to update the target date when querying OF. So we still have a couple of test failures:

https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2016-10-24&endday=2016-11-21&tree=all

Right now I do not see any benefit from this improved crash handling. So it seems like Firefox get killed by the OS and breakpad doesn't catch it. 

While checking the logs on Treeherder I noticed that we miss the gecko.log file those days. I filed bug 1319028 to bring it back. I'm sure we will get some more details there.

Depends on: 1319028

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 27

•

8 years ago

So there was a failure today:

https://treeherder.mozilla.org/#/jobs?repo=autoland&filter-searchStr=Firefox%20UI&bugfiler&fromchange=5ac0fb38b65024441bcc38bf899ea686c283e798&selectedJob=7181196

Log file:
https://public-artifacts.taskcluster.net/Y68Rnwp_TBS5MJlRRo0KJw/0/public/logs/live_backing.log

Failing lines:

> [task 2016-11-24T09:51:41.381944Z] 09:51:41     INFO -  1479981101376	Marionette	TRACE	conn50 -> [0,155,"setContext",{"value":"content"}]
> [task 2016-11-24T09:51:41.382614Z] 09:51:41     INFO -  1479981101377	Marionette	TRACE	conn50 <- [1,155,null,{}]
> [task 2016-11-24T09:51:41.536896Z] 09:51:41     INFO -  1479981101532	Marionette	TRACE	conn50 -> [0,156,"getCurrentUrl",null]
> [task 2016-11-24T09:52:41.459588Z] 09:52:41     INFO -  1479981161455	Marionette	DEBUG	Closed connection conn50

So the client sends a message to get the current URL. But there is no response for 60s. Usually we kill Firefox ourselves but that should result in a different failure message. So I wonder if something strange is going on with the connection and that Windows killed Firefox.

Comment hidden (Intermittent Failures Robot)

8 failures in 623 pushes (0.013 failures/push) were associated with this bug in the last 7 days.  

Repository breakdown:
* autoland: 5
* mozilla-inbound: 2
* mozilla-central: 1

Platform breakdown:
* linux64: 8

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2016-11-21&endday=2016-11-27&tree=all

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 29

•

8 years ago

I actually found a case where I can reproduce this failure. It is handled by bug 1305659. Maybe a fix for that bug could also solve our failure here.

Depends on: 1305659

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 30

•

8 years ago

I'm fairly sure now that the underlying issue for this problem is bug 1322277. The requested page for this test is not served correctly and usually we would run into a page load timeout in Marionette. But given that the socket timeout is shorter, Marionette kills Firefox too early.

Depends on: 1322277
No longer depends on: 1305659, 1319028

Astley Chen (inactive)

Updated

•

7 years ago

status-firefox53: --- → affected

Priority: -- → P2

Hsin-Yi Tsai (she/her) [:hsinyi]

Comment 31

•

7 years ago

In triage meeting: we don't think this is serious enough for blocking 51, and this is kinda late for 51. So, change the status to fixoptional.

status-firefox51: affected → fix-optional

Ryan VanderMeulen [:RyanVM]

Updated

•

7 years ago

status-firefox51: fix-optional → wontfix

status-firefox52: affected → fix-optional

status-firefox53: affected → fix-optional

status-firefox54: --- → fix-optional

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 32

•

7 years ago

This failure hasn't been seen for a while. In detail the last failure happened on 12 Dec 2016, 15:17. Also the depending bug 1322277 just got fixed. I think we should mark the bug as WFM and reopen if it is happening again.

Status: NEW → RESOLVED

Closed: 7 years ago

Resolution: --- → WORKSFORME

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 33

•

7 years ago

It started to happen again like:

https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=Firefox%20&filter-tier=1&filter-tier=2&filter-tier=3&bugfiler&selectedJob=101070077

Status: RESOLVED → REOPENED

status-firefox55: --- → affected

Resolution: WORKSFORME → ---

Marco Bonardo [:mak]

Comment 34

•

7 years ago

we enabled search suggestions by default. I suspect that's related.
You want to set browser.urlbar.userMadeSearchSuggestionsChoice to true and browser.urlbar.suggest.searches to false.

provided these tests are not respecting prefs_general.js

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 35

•

7 years ago

Marco, is there any relationship with bug 1303781? I ask because bug 1257508 started to fail at the same time and is causes by the same Fx ui test.

Flags: needinfo?(mak77)

Marco Bonardo [:mak]

Comment 36

•

7 years ago

(In reply to Henrik Skupin (:whimboo) from comment #35)
> Marco, is there any relationship with bug 1303781? I ask because bug 1257508
> started to fail at the same time and is causes by the same Fx ui test.

Comment 34 is something you may want to do regardless (maybe in a separate bug?), because otherwise any typing in the locationbar will start a remote connection, that we don't want.

bug 1303781 could also be related, and maybe it's more likely to be the culprit (I keep forgetting that "[Errno 111] Connection refused" is not about remote connections but about marionette connection), since it changes the way the location bar buffers keyboard actions like DOWN, TAB, ENTER.

So here the test is typing a url and pressing Enter and that times out. Though doing the same manually in the browser clearly doesn't timeout. So there must be some difference in how it synthesize things.
The alternative is there may be a condition where we don't replay the delayed keys, but it won't be trivial to find out.

Flags: needinfo?(mak77)

Marco Bonardo [:mak]

Comment 37

•

7 years ago

as a side question, maybe you should make these test take a screenshot of the browser when they timeout, like mochitest-browser, at least here we could verify that the text has been properly typed in the locationbar.

Comment hidden (Intermittent Failures Robot)

22 failures in 157 pushes (0.14 failures/push) were associated with this bug yesterday.   

Repository breakdown:
* autoland: 11
* mozilla-inbound: 5
* mozilla-central: 5
* try: 1

Platform breakdown:
* linux32: 20
* linux64: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2017-05-25&endday=2017-05-25&tree=all

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Updated

•

7 years ago

Comment 39

•

7 years ago

(In reply to Marco Bonardo [::mak] from comment #37)
> as a side question, maybe you should make these test take a screenshot of
> the browser when they timeout, like mochitest-browser, at least here we
> could verify that the text has been properly typed in the locationbar.

We usually take screenshots, but here it doesn't work because the browser is somehow gone. So there is nothing to screenshot.

Anyway I had a look at the logs which are now much better than before. A lot of improvements have been added recently. So here the current issue:

[task 2017-05-25T23:44:45.817328Z] 23:44:45     INFO -  1495755885813	Marionette	TRACE	29 -> [0,149,"sendKeysToElement",{"text":"data:text/html,<title>Title</title>","id":"803319f9-f17b-4e40-add5-9b4fc12a8f74"}]
[task 2017-05-25T23:44:46.064663Z] 23:44:46     INFO -  1495755886063	Marionette	TRACE	29 <- [1,149,null,{}]
[task 2017-05-25T23:44:46.153077Z] 23:44:46     INFO -  1495755886150	Marionette	TRACE	29 -> [0,150,"getContext",{}]
[task 2017-05-25T23:44:46.154198Z] 23:44:46     INFO -  1495755886152	Marionette	TRACE	29 <- [1,150,null,{"value":"chrome"}]
[task 2017-05-25T23:44:46.163142Z] 23:44:46     INFO -  1495755886161	Marionette	TRACE	29 -> [0,151,"setContext",{"value":"content"}]
[task 2017-05-25T23:44:46.165804Z] 23:44:46     INFO -  1495755886163	Marionette	TRACE	29 <- [1,151,null,{}]
[task 2017-05-25T23:44:46.201124Z] 23:44:46     INFO -  1495755886195	Marionette	TRACE	29 -> [0,152,"getCurrentUrl",{}]
[task 2017-05-25T23:44:46.237937Z] 23:44:46     INFO -  1495755886230	Marionette	TRACE	29 <- [1,152,null,{"value":"about:"}]
[task 2017-05-25T23:44:46.399875Z] 23:44:46     INFO -  1495755886394	Marionette	TRACE	29 -> [0,153,"getCurrentUrl",{}]
[task 2017-05-25T23:44:46.438103Z] 23:44:46     INFO -  ++DOCSHELL 0xe7dd9c00 == 1 [pid = 1024] [id = {8050bfe7-bff5-4150-a2ef-e87e3175799e}]
[task 2017-05-25T23:44:46.534616Z] 23:44:46     INFO -  ++DOMWINDOW == 1 (0xe7ddd000) [pid = 1024] [serial = 1] [outer = (nil)]
[task 2017-05-25T23:44:46.837284Z] 23:44:46     INFO -  ++DOMWINDOW == 2 (0xebf61800) [pid = 1024] [serial = 2] [outer = 0xe7ddd000]
[task 2017-05-25T23:44:46.877471Z] 23:44:46     INFO -  1495755886873	Marionette	DEBUG	loaded listener.js

`sendKeysToElement` is used to send the data to the location bar, and which triggers a page load. Then when we call `getCurrentUrl` right afterward a remotness change comes into the way which causes Marionette to hang because the now unloaded frame script cannot send the reply for `getCurrentUrl` back to the chrome process.

Andreas, I don't know why we have to ask the listener for the current URL. I would suggest to change that and simply get the value from the chrome process which also knows about it. Are you in favor of that? This should also improve the timing a lot given that the call to the content process can take a couple of seconds on slow test machines and debug builds.

Flags: needinfo?(ato)

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 40

•

7 years ago

(In reply to Marco Bonardo [::mak] from comment #34)
> we enabled search suggestions by default. I suspect that's related.
> You want to set browser.urlbar.userMadeSearchSuggestionsChoice to true and
> browser.urlbar.suggest.searches to false.

I filed bug 1368034 to get this fixed. Thanks.

Andreas Tolfsen ❲:ato❳

Comment 41

•

7 years ago

(In reply to Henrik Skupin (:whimboo) from comment #39)
> Andreas, I don't know why we have to ask the listener for the current URL. I
> would suggest to change that and simply get the value from the chrome
> process which also knows about it. Are you in favor of that? This should
> also improve the timing a lot given that the call to the content process can
> take a couple of seconds on slow test machines and debug builds.

Yes, I have never understood why chrome context has to call the listener script to get the current URL.  Will you make the change?

Flags: needinfo?(ato)

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 42

•

7 years ago

(In reply to Andreas Tolfsen ‹:ato› from comment #41)
> Yes, I have never understood why chrome context has to call the listener
> script to get the current URL.  Will you make the change?

Great. I will do it and just filed bug 1368101.

Depends on: 1368101

Comment hidden (Intermittent Failures Robot)

19 failures in 155 pushes (0.123 failures/push) were associated with this bug yesterday.   

Repository breakdown:
* autoland: 9
* mozilla-inbound: 6
* mozilla-central: 3
* try: 1

Platform breakdown:
* linux32: 15
* linux64: 4

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2017-05-26&endday=2017-05-26&tree=all

Comment hidden (Intermittent Failures Robot)

79 failures in 891 pushes (0.089 failures/push) were associated with this bug in the last 7 days. 

This is the #15 most frequent failure this week. 

** This failure happened more than 75 times this week! Resolving this bug is a very high priority. **

** Try to resolve this bug as soon as possible. If unresolved for 1 week, the affected test(s) may be disabled. **  

Repository breakdown:
* autoland: 37
* mozilla-inbound: 21
* mozilla-central: 13
* try: 8

Platform breakdown:
* linux32: 58
* linux64: 21

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2017-05-22&endday=2017-05-28&tree=all

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 45

•

7 years ago

This should be fixed with my patch on bug 1368101 which already landed on autoland. I will observe the next days.

Henrik Skupin [:whimboo][⌚️UTC+1]

Assignee

Comment 46

•

7 years ago

Since my aforementioned patch got landed I haven't seen this failure on autoland anymore. Marking as fixed.

Assignee: nobody → hskupin

Status: REOPENED → RESOLVED

Closed: 7 years ago → 7 years ago

status-firefox55: affected → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla55

Joel Maher ( :jmaher ) (UTC -8)

Updated

•

7 years ago

Whiteboard: [stockwell fixed]

Ryan VanderMeulen [:RyanVM]

Updated

•

7 years ago

status-firefox52: fix-optional → wontfix

status-firefox53: fix-optional → wontfix

status-firefox-esr52: --- → fix-optional

Comment hidden (Intermittent Failures Robot)

16 failures in 820 pushes (0.02 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* mozilla-inbound: 7
* autoland: 4
* try: 2
* graphics: 2
* mozilla-central: 1

Platform breakdown:
* linux64: 9
* linux32: 7

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2017-05-29&endday=2017-06-04&tree=all

Ryan VanderMeulen [:RyanVM]

Updated

•

7 years ago

status-firefox54: fix-optional → wontfix

Ryan VanderMeulen [:RyanVM]

Comment 48

•

7 years ago

In practice, we're not hitting this on ESR52. Let's just wontfix and get it off the radar.

status-firefox-esr52: fix-optional → wontfix

Comment hidden (Intermittent Failures Robot)

1 failures in 939 pushes (0.001 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* mozilla-esr52: 1

Platform breakdown:
* linux64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1303346&startday=2017-08-28&endday=2017-09-03&tree=all