Closed Bug 1461229 Opened 6 years ago Closed 6 years ago

Intermittent testing/marionette/harness/marionette_harness/tests/unit/test_screenshot.py TestScreenCaptureContent.test_scroll | IOError: Process has been unexpectedly closed (Exit code: 0) (Reason: No data received over socket) [SIGSTKFLT]

Categories

(Testing :: Marionette Client and Harness, defect, P5)

Version 3
ARM
Android
defect

Tracking

(firefox62 affected)

RESOLVED DUPLICATE of bug 1454680
Tracking Status
firefox62 --- affected

People

(Reporter: intermittent-bug-filer, Unassigned)

References

Details

(Keywords: crash, intermittent-failure)

Filed by: aciure [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=178254001&repo=mozilla-inbound

https://queue.taskcluster.net/v1/task/V_JbyIGiREGjf6ILPq1z0g/runs/0/artifacts/public/logs/live_backing.log

[task 2018-05-13T22:02:04.541Z] 22:02:04     INFO -  TEST-START | testing/marionette/harness/marionette_harness/tests/unit/test_screenshot.py TestScreenCaptureContent.test_scroll
[task 2018-05-13T22:03:07.253Z] 22:03:07  WARNING -  TEST-UNEXPECTED-ERROR | testing/marionette/harness/marionette_harness/tests/unit/test_screenshot.py TestScreenCaptureContent.test_scroll | IOError: Process has been unexpectedly closed (Exit code: 0) (Reason: No data received over socket)
[task 2018-05-13T22:03:07.253Z] 22:03:07     INFO -  Traceback (most recent call last):
[task 2018-05-13T22:03:07.253Z] 22:03:07     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_harness/marionette_test/testcases.py", line 159, in run
[task 2018-05-13T22:03:07.254Z] 22:03:07     INFO -      testMethod()
[task 2018-05-13T22:03:07.255Z] 22:03:07     INFO -    File "/builds/worker/workspace/build/tests/marionette/tests/testing/marionette/harness/marionette_harness/tests/unit/test_screenshot.py", line 400, in test_scroll
[task 2018-05-13T22:03:07.256Z] 22:03:07     INFO -      self.marionette.navigate(long)
[task 2018-05-13T22:03:07.257Z] 22:03:07     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py", line 1592, in navigate
[task 2018-05-13T22:03:07.258Z] 22:03:07     INFO -      {"url": url})
[task 2018-05-13T22:03:07.258Z] 22:03:07     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/decorators.py", line 31, in _
[task 2018-05-13T22:03:07.259Z] 22:03:07     INFO -      m._handle_socket_failure()
[task 2018-05-13T22:03:07.260Z] 22:03:07     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py", line 825, in _handle_socket_failure
[task 2018-05-13T22:03:07.260Z] 22:03:07     INFO -      reraise(IOError, message.format(returncode=returncode, reason=val), tb)
[task 2018-05-13T22:03:07.261Z] 22:03:07     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/decorators.py", line 26, in _
[task 2018-05-13T22:03:07.261Z] 22:03:07     INFO -      return func(*args, **kwargs)
[task 2018-05-13T22:03:07.262Z] 22:03:07     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py", line 733, in _send_message
[task 2018-05-13T22:03:07.262Z] 22:03:07     INFO -      msg = self.client.request(name, params)
[task 2018-05-13T22:03:07.263Z] 22:03:07     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/transport.py", line 273, in request
[task 2018-05-13T22:03:07.263Z] 22:03:07     INFO -      return self.receive()
[task 2018-05-13T22:03:07.263Z] 22:03:07     INFO -    File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/transport.py", line 159, in receive
[task 2018-05-13T22:03:07.263Z] 22:03:07     INFO -      raise socket.error("No data received over socket")
[task 2018-05-13T22:03:07.263Z] 22:03:07     INFO -  TEST-INFO took 62711ms
[task 2018-05-13T22:03:07.266Z] 22:03:07     INFO -  test_end for testing/marionette/harness/marionette_harness/tests/unit/test_screenshot.py TestScreenCaptureContent.test_scroll logged while not in progress. Logged with data: {"status": "ERROR", "extra": {"class_name": "test_screenshot.TestScreenCaptureContent", "method_name": "test_scroll"}, "expected": "PASS", "test": "testing/marionette/harness/marionette_harness/tests/unit/test_screenshot.py TestScreenCaptureContent.test_scroll", "message": "MarionetteException: Please start a session", "stack": "Traceback (most recent call last):\n  File \"/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_harness/marionette_test/testcases.py\", line 190, in run\n    self.tearDown()\n  File \"/builds/worker/workspace/build/tests/marionette/tests/testing/marionette/harness/marionette_harness/tests/unit/test_screenshot.py\", line 279, in tearDown\n    self.close_all_tabs()\n  File \"/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_harness/runner/mixins/window_manager.py\", line 36, in close_all_tabs\n    current_window_handles = self.marionette.window_handles\n  File \"/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py\", line 1385, in window_handles\n    return self._send_message(\"WebDriver:GetWindowHandles\")\n  File \"/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/decorators.py\", line 26, in _\n    return func(*args, **kwargs)\n  File \"/builds/worker/workspace/build/venv/lib/python2.7/site-packages/marionette_driver/marionette.py\", line 730, in _send_message\n    raise errors.MarionetteException(\"Please start a session\")\n"}
The above test job gives a couple of interesting information which might be related to other crashes we see on bug 1454680. But I don't know how to interpret those.

James, could you have a look at the following adb logcat output?

https://taskcluster-artifacts.net/V_JbyIGiREGjf6ILPq1z0g/0/public/test_info//logcat-emulator-5554.log

It all starts with:

> 05-13 15:02:52.548   753   788 W GeckoEventDispatcher: No listener for Website:AppLeft
> 05-13 15:02:54.578   753   756 D dalvikvm: JIT unchain all for threadid=1
> 05-13 15:02:55.328   753   756 W dalvikvm: threadid=2: spin on suspend #1 threadid=1 (pcf=0)
> 05-13 15:02:56.079   753   756 W dalvikvm: threadid=2: spin on suspend #2 threadid=1 (pcf=0)

and then goes on until the end of the file where Fennec gets killed with:

> Fatal signal 16 (SIGSTKFLT) at 0x000002f1 (code=-6), thread 753 (a.fennec_aurora)
Blocks: 1454680
Severity: normal → critical
Flags: needinfo?(snorp)
Keywords: crash
OS: Unspecified → Android
Hardware: Unspecified → ARM
Summary: Intermittent testing/marionette/harness/marionette_harness/tests/unit/test_screenshot.py TestScreenCaptureContent.test_scroll | IOError: Process has been unexpectedly closed (Exit code: 0) (Reason: No data received over socket) → Intermittent testing/marionette/harness/marionette_harness/tests/unit/test_screenshot.py TestScreenCaptureContent.test_scroll | IOError: Process has been unexpectedly closed (Exit code: 0) (Reason: No data received over socket) [SIGSTKFLT]
This looks to me like some kind of internal Android issue. The main thread appears to be stuck trying to queue a message onto its own looper[0]. We can see in the final stack trace that Gecko is also trying to post a message, most likely onto the main thread looper as well.

05-13 15:02:56.099   753   756 I dalvikvm: "main" prio=5 tid=1 RUNNABLE JIT
05-13 15:02:56.099   753   756 I dalvikvm:   | group="main" sCount=1 dsCount=0 obj=0x414c9578 self=0x2a00d090
05-13 15:02:56.099   753   756 I dalvikvm:   | sysTid=753 nice=0 sched=0/0 cgrp=apps handle=1073811452
05-13 15:02:56.109   753   756 I dalvikvm:   | state=R schedstat=( 480340526356 431540710465 204354 ) utm=44662 stm=3372 core=0
05-13 15:02:56.119   753   756 I dalvikvm:   at android.animation.ValueAnimator$AnimationHandler.scheduleAnimation(ValueAnimator.java:-1)
05-13 15:02:56.119   753   756 I dalvikvm:   at android.animation.ValueAnimator$AnimationHandler.doAnimationFrame(ValueAnimator.java:638)
05-13 15:02:56.119   753   756 I dalvikvm:   at android.animation.ValueAnimator$AnimationHandler.run(ValueAnimator.java:646)
05-13 15:02:56.119   753   756 I dalvikvm:   at android.view.Choreographer$CallbackRecord.run(Choreographer.java:749)
05-13 15:02:56.129   753   756 I dalvikvm:   at android.view.Choreographer.doCallbacks(Choreographer.java:562)
05-13 15:02:56.129   753   756 I dalvikvm:   at android.view.Choreographer.doFrame(Choreographer.java:531)
05-13 15:02:56.129   753   756 I dalvikvm:   at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:735)
05-13 15:02:56.139   753   756 I dalvikvm:   at android.os.Handler.handleCallback(Handler.java:730)
05-13 15:02:56.139   753   756 I dalvikvm:   at android.os.Handler.dispatchMessage(Handler.java:92)
05-13 15:02:56.139   753   756 I dalvikvm:   at android.os.Looper.loop(Looper.java:137)
05-13 15:02:56.139   753   756 I dalvikvm:   at android.app.ActivityThread.main(ActivityThread.java:5103)
05-13 15:02:56.139   753   756 I dalvikvm:   at java.lang.reflect.Method.invokeNative(Native Method)
05-13 15:02:56.149   753   756 I dalvikvm:   at java.lang.reflect.Method.invoke(Method.java:525)
05-13 15:02:56.149   753   756 I dalvikvm:   at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:737)
05-13 15:02:56.168   753   756 I dalvikvm:   at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:553)
05-13 15:02:56.168   753   756 I dalvikvm:   at dalvik.system.NativeStart.main(Native Method)

05-13 15:03:03.608   753   756 I dalvikvm: "Gecko" prio=5 tid=13 SUSPENDED
05-13 15:03:03.608   753   756 I dalvikvm:   | group="main" sCount=1 dsCount=0 obj=0x41796bd0 self=0x2a1e6fa8
05-13 15:03:03.608   753   756 I dalvikvm:   | sysTid=788 nice=0 sched=0/0 cgrp=apps handle=706638840
05-13 15:03:03.608   753   756 I dalvikvm:   | state=S schedstat=( 1966469658660 1001990283488 381491 ) utm=191001 stm=5645 core=0
05-13 15:03:03.638   753   756 I dalvikvm:   #00  pc 0001c5a4  /system/lib/libc.so (__futex_syscall3+8)
05-13 15:03:03.638   753   756 I dalvikvm:   #01  pc 0000e688  /system/lib/libc.so (__pthread_cond_timedwait_relative+48)
05-13 15:03:03.638   753   756 I dalvikvm:   #02  pc 0000e6e8  /system/lib/libc.so (__pthread_cond_timedwait+64)
05-13 15:03:03.638   753   756 I dalvikvm:   #03  pc 00052e97  /system/lib/libdvm.so
05-13 15:03:03.638   753   756 I dalvikvm:   #04  pc 00053461  /system/lib/libdvm.so (dvmChangeStatus(Thread*, ThreadStatus)+30)
05-13 15:03:03.648   753   756 I dalvikvm:   #05  pc 0004dcb3  /system/lib/libdvm.so (dvmCallJNIMethod(unsigned int const*, JValue*, Method const*, Thread*)+402)
05-13 15:03:03.648   753   756 I dalvikvm:   #06  pc 000385e1  /system/lib/libdvm.so (dvmCheckCallJNIMethod(unsigned int const*, JValue*, Method const*, Thread*)+8)
05-13 15:03:03.648   753   756 I dalvikvm:   #07  pc 00027060  /system/lib/libdvm.so
05-13 15:03:03.648   753   756 I dalvikvm:   #08  pc 0002b580  /system/lib/libdvm.so (dvmInterpret(Thread*, Method const*, JValue*)+184)
05-13 15:03:03.659   753   756 I dalvikvm:   #09  pc 0005fb7b  /system/lib/libdvm.so (dvmCallMethodA(Thread*, Method const*, Object*, bool, JValue*, jvalue const*)+314)
05-13 15:03:03.659   753   756 I dalvikvm:   #10  pc 0004c9c3  /system/lib/libdvm.so
05-13 15:03:03.659   753   756 I dalvikvm:   #11  pc 000406b1  /system/lib/libdvm.so
05-13 15:03:03.659   753   756 I dalvikvm:   #12  pc 017468f1  /data/data/org.mozilla.fennec_aurora/cache/libxul.so
05-13 15:03:03.668   753   756 I dalvikvm:   at android.os.MessageQueue.nativeWake(Native Method)
05-13 15:03:03.668   753   756 I dalvikvm:   at android.os.MessageQueue.enqueueMessage(MessageQueue.java:343)
05-13 15:03:03.668   753   756 I dalvikvm:   at android.os.Handler.enqueueMessage(Handler.java:623)
05-13 15:03:03.668   753   756 I dalvikvm:   at android.os.Handler.sendMessageAtTime(Handler.java:592)
05-13 15:03:03.678   753   756 I dalvikvm:   at android.os.Handler.sendMessageDelayed(Handler.java:563)
05-13 15:03:03.678   753   756 I dalvikvm:   at android.os.Handler.post(Handler.java:323)
05-13 15:03:03.678   753   756 I dalvikvm:   at org.mozilla.gecko.EventDispatcher.dispatchToThread(EventDispatcher.java:341)
05-13 15:03:03.678   753   756 I dalvikvm:   at org.mozilla.gecko.EventDispatcher.dispatchToThreads(EventDispatcher.java:293)
05-13 15:03:03.678   753   756 I dalvikvm:   at org.mozilla.gecko.EventDispatcher.dispatchToThreads(EventDispatcher.java:255)
05-13 15:03:03.688   753   756 I dalvikvm:   at org.mozilla.gecko.mozglue.GeckoLoader.nativeRun(Native Method)
05-13 15:03:03.688   753   756 I dalvikvm:   at org.mozilla.gecko.GeckoThread.run(GeckoThread.java:498)

[0] http://androidxref.com/4.3_r2.1/xref/frameworks/base/core/java/android/animation/ValueAnimator.java#649
Flags: needinfo?(snorp)
So I had a look what SIGSTKFLT means and it is:

> SIGSTKFLT    -,16,-     Term    Stack fault on coprocessor (unused)

Could this be the result of the above mentioned scenario? It looks plausible. Is there anything we can do? Reporting to the Android bug database?
Flags: needinfo?(snorp)
(In reply to Henrik Skupin (:whimboo) from comment #4)
> So I had a look what SIGSTKFLT means and it is:
> 
> > SIGSTKFLT    -,16,-     Term    Stack fault on coprocessor (unused)
> 
> Could this be the result of the above mentioned scenario? It looks
> plausible. Is there anything we can do? Reporting to the Android bug
> database?

In this case Dalvik is sending this to itself because of the hang. From the logcat:

05-13 15:03:04.408   753   756 D dalvikvm: threadid=2: sending two SIGSTKFLTs to threadid=1 (tid=753) to cause debuggerd dump
Flags: needinfo?(snorp)
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
Product: Testing → Remote Protocol
Moving bug to Testing::Marionette Client and Harness component per bug 1815831.
Component: Marionette → Marionette Client and Harness
Product: Remote Protocol → Testing
You need to log in before you can comment on or make changes to this bug.