Closed Bug 971742 Opened 8 years ago Closed 7 years ago

Intermittent Android test_URIs.js | Test timed out | test failed (with xpcshell return code: -1)

Categories

(Core :: Networking, defect)

ARM
Android
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox39 --- fixed
firefox40 --- fixed
firefox41 --- fixed
firefox-esr31 --- wontfix
firefox-esr38 --- fixed
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed
b2g-master --- fixed

People

(Reporter: RyanVM, Assigned: gbrown)

Details

(Keywords: crash, intermittent-failure)

Attachments

(1 file)

Fun story, this has actually been failing intermittently since late November, but were being starred under bug 873053 until a recent harness fix made that go away. Looks like it's silently crashing then timing out.

https://tbpl.mozilla.org/php/getParsedLog.php?id=34558678&tree=Mozilla-Central

Android 2.2 Tegra mozilla-central opt test xpcshell on 2014-02-12 06:07:11 PST for push ed270ce6af69
slave: tegra-068

TEST-INFO | /builds/tegra-068/test/build/tests/xpcshell/tests/netwerk/test/unit/test_URIs.js | running test ...
TEST-UNEXPECTED-FAIL | /builds/tegra-068/test/build/tests/xpcshell/tests/netwerk/test/unit/test_URIs.js | Test timed out
WARNING: killProcess(): forceKill parameter unsupported on SUT
TEST-UNEXPECTED-FAIL | /builds/tegra-068/test/build/tests/xpcshell/tests/netwerk/test/unit/test_URIs.js | test failed (with xpcshell return code: -1), see following log:
>>>>>>>

<snip>

TEST-PASS | test_URIs.js | [do_check_uri_eqExceptRef : 624] true == true
TEST-INFO | test_URIs.js | [do_check_uri_eqExceptRef : 626] (uri equalsExceptRef check: 'http://10.32.4.239/' == 'http://10.32.4.239/')
TEST-PASS | test_URIs.js | [do_check_uri_eqExceptRef : 627] true == true
TEST-INFO | test_URIs.js | [do_check_uri_eqExceptRef : 623] (uri equalsExceptRef check: 'http://10.32.4.239/' == 'http://10.32.4.239/#myRef')
TEST-PASS | test_URIs.js | [do_check_uri_eqExceptRef : 624] true == true
TEST-INFO | test_URIs.js | [do_check_uri_eqExceptRef : 626] (uri equalsExceptRef check: 'http://10.32.4.239/#myRef' == 'http://10.32.4.239/')
TEST-PASS | test_URIs.js | [do_check_uri_eqExceptRef : 627] true == true
TEST-INFO | test_URIs.js | [do_test_mutate_ref : 844] testing that clearing path from /#myRef also clears .ref
TEST-PASS | test_URIs.js | [do_test_mutate_ref : 846] "" == ""
TEST-INFO | test_URIs.js | [do_test_uri_with_hash_suffix : 723] making sure caller is using suffix that starts with '#'
TEST-PASS | test_URIs.js | [do_test_uri_with_hash_suffix : 724] "#" == "#"
TEST-INFO | test_URIs.js | [do_test_uri_with_hash_suffix : 746] testing http://10.32.4.239/ with '#myRef?a=b' appended equals a clone of itself
TEST-INFO | test_URIs.js | [do_check_uri_eq : 600] (uri equals check: 'http://10.32.4.239/#myRef?a=b' == 'http://10.32.4.239/#myRef?a=b')
TEST-PASS | test_URIs.js | [do_check_uri_eq : 601] true == true
TEST-INFO | test_URIs.js | [do_check_uri_eq : 602] (uri equals check: 'http://10.32.4.239/#myRef?a=b' == 'http://10.32.4.239/#myRef?a=b')
{"source_file":"test_URIs.js","test_name":"do_check_uri_eq","l
<<<<<<<
Automation Error: No crash directory (/mnt/sdcard/tests/xpcshell/minidumps) found on remote device
Could not connect; sleeping for 5 seconds.
reconnecting socket
Could not connect; sleeping for 10 seconds.
reconnecting socket
Could not connect; sleeping for 15 seconds.
reconnecting socket
Could not connect; sleeping for 20 seconds.
reconnecting socket
reconnecting socket
Could not connect; sleeping for 5 seconds.
reconnecting socket
Could not connect; sleeping for 10 seconds.
reconnecting socket
Could not connect; sleeping for 15 seconds.
reconnecting socket
Could not connect; sleeping for 20 seconds.
reconnecting socket
INFO | Following exceptions were raised:
Traceback (most recent call last):
  File "/builds/tegra-068/test/build/tests/xpcshell/runxpcshelltests.py", line 166, in run
    self.run_test()
  File "/builds/tegra-068/test/build/tests/xpcshell/runxpcshelltests.py", line 587, in run_test
    head_files, tail_files = self.getHeadAndTailFiles(self.test_object)
  File "xpcshell/remotexpcshelltests.py", line 98, in getHeadAndTailFiles
    return (list(sanitize_list(test['head'], 'head')),
  File "xpcshell/remotexpcshelltests.py", line 92, in sanitize_list
    path))
Exception: head file does not exist: /mnt/sdcard/tests/xpcshell/netwerk/test/unit/head_channels.js

Traceback (most recent call last):
  File "xpcshell/remotexpcshelltests.py", line 610, in <module>
    main()
  File "xpcshell/remotexpcshelltests.py", line 605, in main
    **options.__dict__):
  File "/builds/tegra-068/test/build/tests/xpcshell/runxpcshelltests.py", line 1506, in runTests
    raise exceptions[0]
Exception: head file does not exist: /mnt/sdcard/tests/xpcshell/netwerk/test/unit/head_channels.js
program finished with exit code 1
Logs from recent failures suggest this is simply a long-running test that occasionally takes longer than 300 seconds to complete. That's surprising in light of the test split which happened in bug 969612, to address this very issue.

Logs from successful runs on Android 2.3 often show run times of 240 to 260 seconds.
(In reply to Geoff Brown [:gbrown] from comment #37)
> Logs from recent failures suggest this is simply a long-running test that
> occasionally takes longer than 300 seconds to complete. That's surprising in
> light of the test split which happened in bug 969612, to address this very
> issue.
> 
> Logs from successful runs on Android 2.3 often show run times of 240 to 260
> seconds.

From https://treeherder.mozilla.org/#/jobs?repo=try&revision=4197ebf79d3c...

Android 2.3

11:25:52     INFO -  TEST-START | netwerk/test/unit/test_URIs.js
11:29:21     INFO -  TEST-PASS | netwerk/test/unit/test_URIs.js | took 209885ms
11:29:23     INFO -  TEST-START | netwerk/test/unit/test_URIs2.js
11:32:00     INFO -  TEST-PASS | netwerk/test/unit/test_URIs2.js | took 156837ms

13:44:17     INFO -  TEST-START | netwerk/test/unit/test_URIs.js
13:47:40     INFO -  TEST-PASS | netwerk/test/unit/test_URIs.js | took 203106ms
13:47:41     INFO -  TEST-START | netwerk/test/unit/test_URIs2.js
13:50:16     INFO -  TEST-PASS | netwerk/test/unit/test_URIs2.js | took 154554ms

13:53:20     INFO -  TEST-START | netwerk/test/unit/test_URIs.js
13:57:14     INFO -  TEST-PASS | netwerk/test/unit/test_URIs.js | took 233978ms
13:57:15     INFO -  TEST-START | netwerk/test/unit/test_URIs2.js
14:00:20     INFO -  TEST-PASS | netwerk/test/unit/test_URIs2.js | took 184364ms

Android 4.3

11:16:49     INFO -  TEST-START | netwerk/test/unit/test_URIs.js
11:21:03     INFO -  TEST-PASS | netwerk/test/unit/test_URIs.js | took 253662ms
11:21:04     INFO -  TEST-START | netwerk/test/unit/test_URIs2.js
11:24:07     INFO -  TEST-PASS | netwerk/test/unit/test_URIs2.js | took 182755ms

13:29:49     INFO -  TEST-START | netwerk/test/unit/test_URIs.js
13:34:36     INFO -  TEST-PASS | netwerk/test/unit/test_URIs.js | took 286521ms
13:34:38     INFO -  TEST-START | netwerk/test/unit/test_URIs2.js
13:38:11     INFO -  TEST-PASS | netwerk/test/unit/test_URIs2.js | took 213637ms

13:28:01     INFO -  TEST-START | netwerk/test/unit/test_URIs.js
13:32:07     INFO -  TEST-PASS | netwerk/test/unit/test_URIs.js | took 246089ms
13:32:08     INFO -  TEST-START | netwerk/test/unit/test_URIs2.js
13:35:20     INFO -  TEST-PASS | netwerk/test/unit/test_URIs2.js | took 191460ms


There isn't a lot of room, but I wonder if re-balancing (moving from test_URIs to test_URIs2) would resolve the URIs timeout without causing new ones in URIs2.
Assignee: nobody → gbrown
This just moves a few tests from test_URIs.js to test_URIs2.js.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d55fc46d1e6f shows that the Android 4.3 times for test_URIs and test_URIs2 are now very close:

17:45:01     INFO -  TEST-START | netwerk/test/unit/test_URIs.js
17:49:10     INFO -  TEST-PASS | netwerk/test/unit/test_URIs.js | took 248281ms
17:49:11     INFO -  TEST-START | netwerk/test/unit/test_URIs2.js
17:53:13     INFO -  TEST-PASS | netwerk/test/unit/test_URIs2.js | took 241429ms

17:44:09     INFO -  TEST-START | netwerk/test/unit/test_URIs.js
17:48:07     INFO -  TEST-PASS | netwerk/test/unit/test_URIs.js | took 238582ms
17:48:09     INFO -  TEST-START | netwerk/test/unit/test_URIs2.js
17:52:03     INFO -  TEST-PASS | netwerk/test/unit/test_URIs2.js | took 234202ms

17:19:44     INFO -  TEST-START | netwerk/test/unit/test_URIs.js
17:22:57     INFO -  TEST-PASS | netwerk/test/unit/test_URIs.js | took 193221ms
17:22:59     INFO -  TEST-START | netwerk/test/unit/test_URIs2.js
17:25:56     INFO -  TEST-PASS | netwerk/test/unit/test_URIs2.js | took 177296ms


Obviously there's still a chance of timeouts, but I think this re-balancing is worth a try. If timeouts persist, we can split into 3 tests or perhaps just increase the timeout.
Attachment #8612356 - Flags: review?(rjesup)
Attachment #8612356 - Flags: review?(rjesup) → review+
https://hg.mozilla.org/mozilla-central/rev/263acdef844e
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
You need to log in before you can comment on or make changes to this bug.