Closed Bug 1540036 Opened 6 years ago Closed 6 years ago

Intermittent Tier 2 Android 8 <random-test> | java-exception Main thread (2) stack: android.os.MessageQueue.nativePollOnce(Native Method)

Categories

(Core :: Audio/Video, defect, P5)

Tracking

()

RESOLVED FIXED
mozilla68
Tracking Status
firefox-esr60 --- unaffected
firefox66 --- unaffected
firefox67 --- unaffected
firefox68 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: jhlin)

References

(Regression)

Details

(Keywords: intermittent-failure, regression, Whiteboard: [stockwell fixed])

Crash Data

Attachments

(2 files)

Filed by: btara [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=236801755&repo=autoland

https://queue.taskcluster.net/v1/task/bRPfkv9fSHWyTi6bRbG-PQ/runs/0/artifacts/public/logs/live_backing.log

03:57:05 INFO - 375 INFO TEST-START | dom/media/webspeech/recognition/test/test_timeout.html
03:57:15 INFO - 376 INFO TEST-OK | dom/media/webspeech/recognition/test/test_timeout.html | took 10358ms
03:57:15 INFO - 377 INFO TEST-START | Shutdown
03:57:15 INFO - 378 INFO Passed: 56
03:57:15 INFO - 379 INFO Failed: 0
03:57:15 INFO - 380 INFO Todo: 0
03:57:15 INFO - 381 INFO Mode: non-e10s
03:57:15 INFO - 382 INFO Slowest: 10359ms - /tests/dom/media/webspeech/recognition/test/test_timeout.html
03:57:15 INFO - 383 INFO SimpleTest FINISHED
03:57:17 INFO - wait for org.mozilla.fennec_aurora complete; top activity=com.bitbar.testdroid.monitor
03:57:17 INFO - remoteautomation.py | Application ran for: 0:00:36.332150
03:57:17 WARNING - PROCESS-CRASH | dom/media/webspeech/recognition/test/test_timeout.html | java-exception Main thread (2) stack: android.os.MessageQueue.nativePollOnce(Native Method)
03:57:17 INFO - Stopping web server
03:57:17 INFO - Stopping web socket server
03:57:17 INFO - Stopping ssltunnel
03:57:17 INFO - websocket/process bridge listening on port 8191
03:57:17 INFO - Stopping websocket/process bridge
03:57:17 INFO - leakcheck | refcount logging is off, so leaks can't be detected!
03:57:17 INFO - runtests.py | Running tests: end.
03:57:19 INFO - Buffered messages finished
03:57:19 INFO - Running manifest: dom/media/webspeech/synth/test/mochitest.ini
03:57:20 INFO - adb Ignoring attempt to chmod external storage
03:57:33 INFO - pk12util: PKCS12 IMPORT SUCCESSFUL
03:57:37 INFO - MochitestServer : launching [u'/builds/worker/workspace/build/hostutils/host-utils-67.0a1.en-US.linux-x86_64/xpcshell', '-g', '/builds/worker/workspace/build/hostutils/host-utils-67.0a1.en-US.linux-x86_64', '-f', '/builds/worker/workspace/build/hostutils/host-utils-67.0a1.en-US.linux-x86_64/components/httpd.js', '-e', "const _PROFILE_PATH = '/tmp/tmp7TOX3i.mozrunner'; const _SERVER_PORT = '8854'; const _SERVER_ADDR = '10.7.205.210'; const _TEST_PREFIX = undefined; const _DISPLAY_RESULTS = false;", '-f', '/builds/worker/workspace/build/tests/mochitest/server.js']
03:57:37 INFO - runtests.py | Server pid: 1190
03:57:37 INFO - runtests.py | Websocket server pid: 1193
03:57:37 INFO - runtests.py | websocket/process bridge pid: 1199
03:57:37 INFO - runtests.py | SSL tunnel pid: 1215
03:57:38 INFO - adb Ignoring attempt to chmod external storage
03:57:38 INFO - runtests.py | Running with scheme: http
03:57:38 INFO - runtests.py | Running with e10s: False
03:57:38 INFO - runtests.py | Running with serviceworker_e10s: False
03:57:38 INFO - runtests.py | Running with socketprocess_e10s: False
03:57:38 INFO - runtests.py | Running tests: start.
03:57:38 INFO - adb Granting important runtime permissions to org.mozilla.fennec_aurora
03:57:41 INFO - adb launch_application: am start -W -n org.mozilla.fennec_aurora/org.mozilla.gecko.BrowserApp -a android.intent.action.VIEW --es env9 MOZ_CRASHREPORTER_NO_REPORT=1 --es env8 MOZ_UPLOAD_DIR=/sdcard/tests/mozlog --es args "-no-remote -profile /sdcard/tests/profile//" --es env3 DISABLE_UNSAFE_CPOW_WARNINGS=1 --es env2 R_LOG_VERBOSE=1 --es env1 XPCOM_DEBUG_BREAK=stack --es env0 MOZ_CRASHREPORTER=1 --es env7 R_LOG_DESTINATION=stderr --es env6 MOZ_CRASHREPORTER_SHUTDOWN=1 --es env5 MOZ_IN_AUTOMATION=1 --es env4 MOZ_DISABLE_NONLOCAL_CONNECTIONS=1 --es env11 MOZ_HIDE_RESULTS_TABLE=1 --es env10 R_LOG_LEVEL=6 -d "http://mochi.test:8888/tests?autorun=1&closeWhenDone=1&logFile=%2Fsdcard%2Ftests%2Flogs%2Fmochitest.log&fileLevel=INFO&consoleLevel=INFO&hideResultsTable=1&manifestFile=tests.json&dumpOutputDirectory=%2Fsdcard%2Ftests"
03:57:41 INFO - remoteautomation.py | Application pid: 7589
...
03:58:02 INFO - 403 INFO TEST-START | dom/media/webspeech/synth/test/test_speech_simple.html
03:58:02 INFO - 404 INFO TEST-OK | dom/media/webspeech/synth/test/test_speech_simple.html | took 465ms
03:58:02 INFO - 405 INFO TEST-START | Shutdown
03:58:02 INFO - 406 INFO Passed: 322
03:58:02 INFO - 407 INFO Failed: 0
03:58:02 INFO - 408 INFO Todo: 0
03:58:02 INFO - 409 INFO Mode: non-e10s
03:58:02 INFO - 410 INFO Slowest: 2498ms - /tests/dom/media/webspeech/synth/test/test_indirect_service_events.html
03:58:02 INFO - 411 INFO SimpleTest FINISHED
03:58:04 INFO - wait for org.mozilla.fennec_aurora complete; top activity=com.bitbar.testdroid.monitor
03:58:04 INFO - remoteautomation.py | Application ran for: 0:00:26.251158
03:58:04 WARNING - PROCESS-CRASH | dom/media/webspeech/synth/test/test_speech_simple.html | java-exception Main thread (2) stack: android.os.MessageQueue.nativePollOnce(Native Method)
03:58:04 INFO - Stopping web server
03:58:04 INFO - Stopping web socket server
03:58:04 INFO - Stopping ssltunnel
03:58:04 INFO - websocket/process bridge listening on port 8191
03:58:04 INFO - Stopping websocket/process bridge
03:58:04 INFO - leakcheck | refcount logging is off, so leaks can't be detected!
03:58:04 INFO - runtests.py | Running tests: end.
03:58:06 INFO - Buffered messages finished
03:58:07 INFO - 0 INFO TEST-START | Shutdown
03:58:07 INFO - 1 INFO Passed: 631
03:58:07 INFO - 2 INFO Failed: 0
03:58:07 INFO - 3 INFO Todo: 1
03:58:07 INFO - 4 INFO Mode: non-e10s
03:58:07 INFO - 5 INFO SimpleTest FINISHED

Flags: needinfo?(jolin)

03-28 20:55:57.802 6572 6949 W System.err: java.io.IOException: java.lang.NullPointerException: Attempt to invoke virtual method 'long org.mozilla.gecko.mozglue.SharedMemory.getPointer()' on a null object reference
03-28 20:55:57.802 6572 6949 W System.err: at org.mozilla.gecko.media.SampleBuffer.writeToByteBuffer(SampleBuffer.java:78)
03-28 20:55:57.802 6572 6949 W System.err: Caused by: java.lang.NullPointerException: Attempt to invoke virtual method 'long org.mozilla.gecko.mozglue.SharedMemory.getPointer()' on a null object reference
03-28 20:55:57.802 6572 6949 W System.err: at org.mozilla.gecko.media.SampleBuffer.writeToByteBuffer(SampleBuffer.java:76)
03-28 20:55:57.802 6572 6949 E GeckoCrashHandler: >>> REPORTING UNCAUGHT EXCEPTION FROM THREAD 118 ("Thread-51")

Assignee: nobody → jolin
Flags: needinfo?(jolin)
Summary: Intermittent Tier 2 Android 8 dom/media/webspeech/recognition/test/test_timeout.html OR test_speech_simple.html | java-exception Main thread (2) stack: android.os.MessageQueue.nativePollOnce(Native Method) → Intermittent Tier 2 Android 8 <random-test> | java-exception Main thread (2) stack: android.os.MessageQueue.nativePollOnce(Native Method)

This is still failing at a really high rate. In the last 7 days we have 260 failures, all on android-hw-p2-8-0-arm7-api-16: https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-03-28&endday=2019-04-04&tree=all&bug=1540036

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=238047651&repo=mozilla-central&lineNumber=3240

John, any updates on this? This is now on the disable-recommended bugs list.

Flags: needinfo?(jolin)
Crash Signature: [@ mozilla::java::SampleBuffer::WriteToByteBuffer]

When remote codec executes flush(), it throws away existing buffers:

  • memorized buffers are no longer valid and need to be forgotten
  • samples returned before flush() will have null buffers and should
    be released back to remote codec immediately.
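
A minimal Java sketch of the flush handling described above. Every name here (FlushAwareBufferCache, memorize, acquire, release) is hypothetical and chosen for illustration only; this is not the code from the patch, just one way the two rules in the list could look under those assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Gecko source.
final class FlushAwareBufferCache {
    // Buffers remembered by id so that later samples can reuse them.
    private final Map<Integer, byte[]> memorized = new HashMap<>();
    private int releasedCount = 0;

    void memorize(int bufferId, byte[] buffer) {
        memorized.put(bufferId, buffer);
    }

    // After flush() the codec has thrown its buffers away, so forget ours too.
    void onFlush() {
        memorized.clear();
    }

    // Returns the buffer for a sample, or null when the sample predates a flush.
    // A stale sample is released right away instead of being processed.
    byte[] acquire(int bufferId) {
        byte[] buffer = memorized.get(bufferId);
        if (buffer == null) {
            release(bufferId);
        }
        return buffer;
    }

    // Stand-in for handing the sample back to the remote codec.
    void release(int bufferId) {
        releasedCount++;
    }

    int releasedCount() {
        return releasedCount;
    }

    public static void main(String[] args) {
        FlushAwareBufferCache cache = new FlushAwareBufferCache();
        cache.memorize(1, new byte[16]);
        System.out.println("valid before flush: " + (cache.acquire(1) != null));
        cache.onFlush();
        System.out.println("valid after flush: " + (cache.acquire(1) != null));
        System.out.println("released: " + cache.releasedCount());
    }
}
```

The point of the sketch is that a caller never dereferences a buffer it looked up before the flush: the lookup either yields a live buffer or releases the sample and returns null.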

HandleOutput() runs on the Android binder thread pool and could be preempted
by the RemoteDataDecoder task queue. That means ProcessOutput() could be scheduled
after ProcessShutdown() or ProcessFlush(). When that happens, aBuffer is no
longer valid and should never be processed.
Also assert preconditions of buffers received from the Java callback.

Depends on D26188
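
The ordering problem above can be illustrated with a small Java sketch. It assumes a single-threaded executor standing in for the decoder task queue, and all names (GuardedDecoder, State, handleOutput, snapshot) are hypothetical rather than taken from the patch, which changes C++ code. The idea shown is only that the state check happens on the task queue at processing time, not on the callback thread at enqueue time.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: not the RemoteDataDecoder implementation.
final class GuardedDecoder {
    enum State { RUNNING, SHUTDOWN }

    // Single daemon thread standing in for the decoder task queue.
    private final ExecutorService taskQueue = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "decoder-task-queue");
        t.setDaemon(true);
        return t;
    });

    // These fields are only read and written on the task queue thread.
    private State state = State.RUNNING;
    private int processed = 0;
    private int dropped = 0;

    // May be called from any callback thread; only enqueues the work.
    void handleOutput(byte[] buffer) {
        taskQueue.execute(() -> processOutput(buffer));
    }

    void shutdown() {
        taskQueue.execute(() -> state = State.SHUTDOWN);
    }

    // Runs on the task queue, possibly after shutdown() was queued first.
    private void processOutput(byte[] buffer) {
        if (state != State.RUNNING || buffer == null) {
            dropped++; // stale or missing buffer: never touch it
            return;
        }
        processed++;
    }

    // Waits for queued tasks to finish and returns {processed, dropped}.
    int[] snapshot() {
        Callable<int[]> read = () -> new int[] { processed, dropped };
        try {
            return taskQueue.submit(read).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        GuardedDecoder decoder = new GuardedDecoder();
        decoder.handleOutput(new byte[8]); // queued before shutdown
        decoder.shutdown();
        decoder.handleOutput(new byte[8]); // queued after shutdown, dropped
        int[] counts = decoder.snapshot();
        System.out.println("processed=" + counts[0] + " dropped=" + counts[1]);
    }
}
```

Checking the state inside processOutput() rather than inside handleOutput() matters because the callback thread cannot know what else will have run on the task queue by the time its task is picked up.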

Blocks: 1540025

(In reply to Cosmin Sabou [:CosminS] from comment #19)

> This is still failing at a really high rate. In the last 7 days we have 260 failures, all on android-hw-p2-8-0-arm7-api-16: https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2019-03-28&endday=2019-04-04&tree=all&bug=1540036
>
> Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=238047651&repo=mozilla-central&lineNumber=3240
>
> John, any updates on this? This is now on the disable-recommended bugs list.

Thanks for checking in.

I just uploaded the patches for review. I've run the tests in dom/media/test repeatedly on a Pixel 2 with a local build and have not hit the crash so far.

Flags: needinfo?(jolin)
Pushed by jolin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/4b2b10988c9e
p1: Handle buffer change for flush(). r=jya
https://hg.mozilla.org/integration/autoland/rev/6d3077516c70
p2: Check buffer and codec state before processing buffers. r=jya
Whiteboard: [stockwell disable-recommended] → [stockwell fixed]
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
Regressions: 1560611
Has Regression Range: --- → yes