Closed Bug 1228222 Opened 8 years ago Closed 8 years ago

Intermittent test_Fetch.js | Process still running after test!, | Test timed out

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

RESOLVED FIXED
mozilla49
Tracking Status
firefox48 --- fixed
firefox49 --- fixed
firefox-esr45 --- fixed

People

(Reporter: philor, Assigned: bkelly)

Details

(Keywords: intermittent-failure)

Attachments

(1 file, 1 obsolete file)

This is extremely low frequency.  Unfortunately I don't think I will have time to investigate it right now.  I'll stay CC'd to watch the orangefactor reports.
Flags: needinfo?(bkelly)
Summary: Intermittent test_Fetch.js | Test timed out → Intermittent test_Fetch.js | Process still running after test!, | Test timed out
Seems to be more frequent these days. Any chance you can take a look, Ben?
Flags: needinfo?(bkelly)
Yea, I'll add it to my list.  Leaving NI for now.
This is only failing on macos.  I wonder if we have a filesystem case issue given there is an xpcshell test called test_fetch.js and another called test_Fetch.js.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=f247a1924003
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Flags: needinfo?(bkelly)
So that try build didn't work.

The log, though, is really weird.  We are retrying the test, it times out, and then after logging the timeout message the test actually runs:

 17:42:24     INFO -  Retrying tests that failed when run in parallel.
 17:42:24     INFO -  TEST-START | dom/tests/unit/test_dom_fetch.js
 17:47:24  WARNING -  TEST-UNEXPECTED-TIMEOUT | dom/tests/unit/test_dom_fetch.js | Test timed out
 17:47:24     INFO -  TEST-INFO took 300003ms
 17:47:24     INFO -  >>>>>>>
 17:47:24     INFO -  PROCESS | 10645 | [10645] WARNING: Couldn't get the user appdata directory. Crash events may not be produced.: file /builds/slave/try-m64-d-00000000000000000000/build/src/toolkit/crashreporter/nsExceptionHandler.cpp, line 2724
 17:47:24     INFO -  (xpcshell/head.js) | test MAIN run_test pending (1)
 17:47:24     INFO -  (xpcshell/head.js) | test run_next_test 0 pending (2)
 17:47:24     INFO -  (xpcshell/head.js) | test MAIN run_test finished (2)
 17:47:24     INFO -  running event loop
 17:47:24     INFO -  dom/tests/unit/test_dom_fetch.js | Starting test_GetData
 17:47:24     INFO -  (xpcshell/head.js) | test test_GetData pending (2)
 17:47:24     INFO -  (xpcshell/head.js) | test pending (3)
 17:47:24     INFO -  (xpcshell/head.js) | test run_next_test 0 finished (3)

It's like xpcshell isn't actually starting the test or something.
Let's see if just running the test sequentially helps.  I don't really want to spend a lot of time learning the inner workings of xpcshell.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=098f3faa871a
Oh, I see.  The xpcshell harness does not show stderr output unless the test fails.  And then it shows it after the failure message.

It seems we're timing out in test_getTestFailedConnect() trying to access ftp://localhost.  Maybe some of the Mac instances are running an FTP server.  Let's not do that any more.
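For local debugging, a hang like this can be made to fail quickly instead of waiting for the harness timeout. The following is only a minimal sketch in plain JavaScript, not part of the patch or of xpcshell; withTimeout is a hypothetical helper name, and the timeout value is whatever the caller chooses.

```javascript
// Hypothetical helper (not an xpcshell API): reject if `promise` has not
// settled within `ms` milliseconds, otherwise pass its result through.
function withTimeout(promise, ms, label = "operation") {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms
    );
  });
  // Whichever settles first wins; always clear the timer afterwards so it
  // does not keep the event loop alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Wrapping the connection attempt, for example withTimeout(fetch(url), 5000, "fetch"), would turn a connection that never completes into a prompt rejection with a readable message.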
Comment on attachment 8755973 [details] [diff] [review]
Don't try to access ftp::/localhost to test "unknown server". r=asuth

The real fix here is not to use ftp://localhost.  I did fix up a couple of function names to be unique, though, since I noticed they had been cut and pasted without being renamed.
Attachment #8755973 - Flags: review?(bugmail)
Comment on attachment 8755973 [details] [diff] [review]
Don't try to access ftp::/localhost to test "unknown server". r=asuth

Review of attachment 8755973 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/tests/unit/test_Fetch.js
@@ +202,5 @@
>  // test a failure to connect
>  add_test(function test_getTestFailedConnect() {
>    do_test_pending();
>    // try a server that's not there
> +  fetch("http://doesnotexist.mochi.test:8888/should/fail").then(response => {

You're converting this from a connect failure to a name resolution failure.  Although this is probably safe for our test automation where we control DNS and the test doesn't really seem to care about the semantics that much, this could be problematic for contributors.  For example, "dig @71.242.0.12 doesnotexist.mochi.test" returns 92.242.140.21 which redirects to "http://searchassist.verizon.com" if accessed via IP.  (71.242.0.12 is the Verizon fios philly-area DNS per http://drewgraybeal.blogspot.com/2015/01/verizon-fios-regional-dns-servers_27.html noting that it may not work from outside Verizon's networks.)  For that specific server, the port appears to be firewalled to drop the packets so the connection will just sit there until it eventually times out, inducing a test failure.

I would suggest using something like "http://localhost:4/should/fail" instead if we want to keep testing an attempt to connect to a port that's not listening.  I've chosen 4 because it's a low, privileged-bind port that's unassigned and not on our default port blacklist at https://dxr.mozilla.org/mozilla-central/source/netwerk/base/nsIOService.cpp#90, so we will attempt to make the connection.  It's also not a port that bind(0) will hand out, nor one that any system service should really be binding to (that last part is an assumption on my part; I have no documented proof).
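As a rough illustration of what such a test checks: per the Fetch spec, a network-level failure such as a refused connection rejects the fetch() promise with a TypeError. The sketch below is not the xpcshell test itself. expectConnectFailure is a hypothetical helper, and the fetch implementation is passed in as a parameter so the check can be tried with any fetch-like function.

```javascript
// Hypothetical helper, not part of the patch: resolves to true when
// fetchImpl rejects with a TypeError, which is how fetch() reports a
// network error. Throws if the request unexpectedly succeeds.
async function expectConnectFailure(fetchImpl, url) {
  let response;
  try {
    response = await fetchImpl(url);
  } catch (err) {
    return err instanceof TypeError;
  }
  throw new Error(
    `fetch unexpectedly succeeded for ${url} (status ${response.status})`
  );
}
```

In a runtime that provides a global fetch, expectConnectFailure(fetch, "http://localhost:4/should/fail") should resolve to true as long as nothing is listening on port 4.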

Alternately, we can do something more involved like invoking bind(0) to get a port.  That seems possibly more complicated and error-prone, so I like localhost:4 more, but it's your call.  r+ contingent on addressing this somehow, though.
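For comparison, the bind(0) alternative might look roughly like the sketch below. It assumes a Node.js environment and its net module rather than xpcshell, and findUnusedPort is a hypothetical name. It also shows where the extra complexity comes from: once the port is released, nothing stops another process from claiming it before the test connects.

```javascript
const net = require("net");

// Hypothetical helper: ask the OS for a currently unused TCP port by
// listening on port 0, then close the listener so the port is free again.
// Caveat: another process could grab the port after it is released.
function findUnusedPort() {
  return new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once("error", reject);
    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address();
      server.close(err => (err ? reject(err) : resolve(port)));
    });
  });
}
```

A test could then attempt a connection to http://127.0.0.1:<port>/should/fail and expect it to be refused, at the cost of an extra asynchronous step compared with a fixed port such as 4.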
Attachment #8755973 - Flags: review?(bugmail) → review+
Updated to use http://localhost:4 as suggested.  Thanks!
Attachment #8755973 - Attachment is obsolete: true
Attachment #8756391 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/915fe7554402
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Component: DOM → DOM: Core & HTML