Bug 1659965 (closed): opened 4 years ago, closed 4 years ago

Intermittent [Tier2] Linux CCOV TEST-UNEXPECTED-FAIL | automation.py | application terminated with exit code -11

Categories

(Testing :: Code Coverage, defect, P5)

Tracking

(firefox-esr68 unaffected, firefox-esr78 unaffected, firefox79 unaffected, firefox80 unaffected, firefox81 fixed)

RESOLVED FIXED
81 Branch

People

(Reporter: intermittent-bug-filer, Assigned: aosmond)

References

(Regression)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell disable-recommended])

Attachments

(1 file)

Filed by: nbeleuzu [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=313428139&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/RIdID35dSmeDZJGCTau7YQ/runs/0/artifacts/public/logs/live_backing.log


[task 2020-08-19T11:17:50.782Z] 11:17:50 INFO - runtests.py | Running tests: start.
[task 2020-08-19T11:17:50.782Z] 11:17:50 INFO -
[task 2020-08-19T11:17:50.797Z] 11:17:50 INFO - Application command: /builds/worker/workspace/build/application/firefox/firefox -marionette -foreground -profile /tmp/tmpsGC8bF.mozrunner
[task 2020-08-19T11:17:50.797Z] 11:17:50 INFO - runtests.py | Application pid: 1517
[task 2020-08-19T11:17:50.797Z] 11:17:50 INFO - TEST-INFO | started process GECKO(1517)
[task 2020-08-19T11:20:50.887Z] 11:20:50 INFO - runtests.py | Waiting for browser...
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - TEST-INFO | Main app process: killed by SIGSEGV
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - Buffered messages finished
[task 2020-08-19T11:20:50.889Z] 11:20:50 ERROR - TEST-UNEXPECTED-FAIL | automation.py | application terminated with exit code -11
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - runtests.py | Application ran for: 0:03:00.090588
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - zombiecheck | Reading PID log: /tmp/tmprh5K__pidlog
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - Traceback (most recent call last):
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - File "/builds/worker/workspace/build/tests/mochitest/runtests.py", line 2979, in doTests
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - e10s=options.e10s
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - File "/builds/worker/workspace/build/tests/mochitest/runtests.py", line 2442, in runApp
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - six.reraise(exc, value, tb)
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - File "/builds/worker/workspace/build/tests/mochitest/runtests.py", line 2355, in runApp
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - self.marionette.start_session()
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/decorators.py", line 36, in _
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - m._handle_socket_failure()
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py", line 654, in _handle_socket_failure
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - reraise(exc_cls, exc, tb)
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/decorators.py", line 26, in _
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - return func(*args, **kwargs)
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py", line 1112, in start_session
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - self.raise_for_port(timeout=timeout)
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py", line 573, in raise_for_port
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - self.host, self.port))
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - timeout: Timed out waiting for connection on 127.0.0.1:2828!
[task 2020-08-19T11:20:50.889Z] 11:20:50 ERROR - Automation Error: Received unexpected exception while running application
[task 2020-08-19T11:20:50.889Z] 11:20:50 ERROR -
[task 2020-08-19T11:20:50.889Z] 11:20:50 INFO - Stopping web server
[task 2020-08-19T11:20:50.896Z] 11:20:50 INFO - Stopping web socket server
[task 2020-08-19T11:20:50.917Z] 11:20:50 INFO - Stopping ssltunnel
[task 2020-08-19T11:20:50.944Z] 11:20:50 INFO - websocket/process bridge listening on port 8191
[task 2020-08-19T11:20:50.972Z] 11:20:50 INFO - Stopping websocket/process bridge
[task 2020-08-19T11:20:50.972Z] 11:20:50 WARNING - leakcheck | refcount logging is off, so leaks can't be detected!
[task 2020-08-19T11:20:50.973Z] 11:20:50 INFO - runtests.py | Running tests: end.
[task 2020-08-19T11:20:50.973Z] 11:20:50 INFO - Buffered messages finished
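
For readers of the log above: the "exit code -11" and the "killed by SIGSEGV" lines describe the same event. The sketch below is not part of the test harness; it assumes the harness follows the same convention as Python's `subprocess` module, where a negative return code is the negated number of the signal that killed the process. The helper name `describe_exit` is made up for illustration, and the code needs Python 3.5+ for `signal.Signals` (the harness in the log ran on Python 2.7).

```python
import signal


def describe_exit(returncode):
    """Describe a subprocess-style return code in words (hypothetical helper)."""
    if returncode < 0:
        # subprocess convention on POSIX: -N means "terminated by signal N".
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


if __name__ == "__main__":
    print(describe_exit(-11))
```

On Linux, signal 11 is SIGSEGV, so the call above reports the same cause as the TEST-INFO line in the log.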

Flags: needinfo?(mcastelluccio)

Is it possible to backfill these jobs on autoland to get a smaller regression range?

Flags: needinfo?(sheriffs)

We (sheriffs) plan to check whether the issue affects the next Linux ccov build and backfill if it does. If the build and tests succeed, should this issue still be investigated (by rerunning the failed build plus a test task, and backfilling if the rerun also fails)?

Flags: needinfo?(sheriffs) → needinfo?(dmajor)

If the jobs go back to green then I wouldn't worry about it IMO.

Flags: needinfo?(dmajor)

Looks like the failures are fallout from Bug 1658847: https://hg.mozilla.org/integration/autoland/rev/4a008bd49ea4c445c9b3bfe947e4288427daf342
Here are the backfills.
Andrew, could you take a look at this, as Linux ccov tests are failing en masse because of it? TH link. Thank you.

Flags: needinfo?(mcastelluccio) → needinfo?(aosmond)
Regressed by: 1658847

Set release status flags based on info from the regressing bug 1658847

I suspect the profiler needs to be initialized before code coverage.

Assignee: nobody → aosmond
Flags: needinfo?(aosmond)

CodeCoverageHandler relies upon CrossProcessMutex. On Linux this is
implemented using shared memory. In bug 1658847, we used the
profiler's "thread sleep" mechanism to resolve the signal interrupts in
posix_fallocate, which requires the profiler to be initialized. As such,
CodeCoverageHandler now needs to be initialized after the profiler.
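
The ordering constraint described above can be illustrated with a toy model. This is a sketch in Python, not Gecko code: the classes borrow names from the commit message, but every method and the `startup` helper are hypothetical. The model raises an exception where the real browser process crashed.

```python
class Profiler:
    """Toy stand-in for the profiler; only tracks whether init() has run."""
    initialized = False

    @classmethod
    def init(cls):
        cls.initialized = True

    @classmethod
    def thread_sleep(cls):
        # Hypothetical hook: only usable once the profiler is initialized.
        if not cls.initialized:
            raise RuntimeError("profiler used before initialization")


class CrossProcessMutex:
    """Toy stand-in for a mutex backed by shared memory."""

    def __init__(self):
        # In this model, allocating the shared memory goes through the
        # profiler's "thread sleep" hook, as the commit message describes.
        Profiler.thread_sleep()


class CodeCoverageHandler:
    """Toy stand-in for the component that needs the mutex at startup."""
    mutex = None

    @classmethod
    def init(cls):
        cls.mutex = CrossProcessMutex()


def startup(steps):
    """Run initialization steps in order and report the outcome."""
    Profiler.initialized = False
    try:
        for step in steps:
            step()
    except RuntimeError as error:
        return "failed: %s" % error
    return "ok"


if __name__ == "__main__":
    print(startup([CodeCoverageHandler.init, Profiler.init]))
    print(startup([Profiler.init, CodeCoverageHandler.init]))
```

Only the second ordering succeeds in this model, which mirrors the fix: initialize the profiler first, then CodeCoverageHandler.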

Pushed by malexandru@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a341ca5f7ec5 Initialize the profiler before CodeCoverageHandler. r=mstange
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 81 Branch