Perma Linux x64 QR debug W-fis(wpt5) fix-stacks error: failed to read breakpad symbols dir `/builds/worker/workspace/build/symbols/libpthread.so.0` for `/lib/x86_64-linux-gnu/libpthread.so.0`
Categories
(Toolkit :: Crash Reporting, defect, P5)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr68 | --- | unaffected |
firefox74 | --- | unaffected |
firefox75 | --- | unaffected |
firefox76 | --- | unaffected |
firefox77 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: n.nethercote)
References
(Regression)
Details
(Keywords: intermittent-failure, regression)
Attachments
(1 file)
Filed by: archaeopteryx [at] coole-files.de
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=296760839&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Vmhke1sDT1677ACYixLypg/runs/0/artifacts/public/logs/live_backing.log
This shows up in at least Linux x64 18.04 Quantumrender with wpt5 fission enabled since bug 1622701 landed.
[task 2020-04-08T08:00:50.172Z] 08:00:50 INFO - TEST-TIMEOUT | /printing/print-microtask-after-navigate.html | took 30791ms
[task 2020-04-08T08:00:50.236Z] 08:00:50 INFO - PID 12487 | 1586332850227 Marionette INFO Stopped listening on port 59336
[task 2020-04-08T08:00:50.821Z] 08:00:50 INFO - PID 12487 | [GPU 12584, Compositor] WARNING: Possibly dropping task posted to updater thread: file /builds/worker/checkouts/gecko/gfx/layers/apz/src/APZUpdater.cpp, line 371
[task 2020-04-08T08:00:50.889Z] 08:00:50 INFO - PID 12487 | ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
[task 2020-04-08T08:00:50.966Z] 08:00:50 INFO - PID 12487 | [2020-04-08T08:00:50Z WARN xulstore::persist] tried to remove key that isn't in the store
[task 2020-04-08T08:00:50.966Z] 08:00:50 INFO - PID 12487 | [2020-04-08T08:00:50Z WARN xulstore::persist] tried to remove key that isn't in the store
[task 2020-04-08T08:00:51.043Z] 08:00:51 INFO - PID 12487 | [Child 12655, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/worker/checkouts/gecko/netwerk/base/nsNetUtil.cpp, line 2307
[task 2020-04-08T08:00:51.044Z] 08:00:51 INFO - PID 12487 | [Child 12655, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/checkouts/gecko/caps/ContentPrincipal.cpp, line 398
[task 2020-04-08T08:00:51.045Z] 08:00:51 INFO - PID 12487 | [Child 12655, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/worker/checkouts/gecko/caps/ContentPrincipal.cpp, line 420
[task 2020-04-08T08:00:51.045Z] 08:00:51 INFO - PID 12487 | [Child 12655, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/worker/checkouts/gecko/netwerk/base/nsNetUtil.cpp, line 2307
[task 2020-04-08T08:00:51.045Z] 08:00:51 INFO - PID 12487 | [Child 12655, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/checkouts/gecko/caps/ContentPrincipal.cpp, line 398
[task 2020-04-08T08:00:51.046Z] 08:00:51 INFO - PID 12487 | [Child 12655, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/worker/checkouts/gecko/caps/ContentPrincipal.cpp, line 420
[task 2020-04-08T08:00:51.046Z] 08:00:51 INFO - PID 12487 | [Child 12655, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/worker/checkouts/gecko/netwerk/base/nsNetUtil.cpp, line 2307
[task 2020-04-08T08:00:51.047Z] 08:00:51 INFO - PID 12487 | [Child 12655, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/checkouts/gecko/caps/ContentPrincipal.cpp, line 398
[task 2020-04-08T08:00:51.047Z] 08:00:51 INFO - PID 12487 | [Child 12655, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/worker/checkouts/gecko/caps/ContentPrincipal.cpp, line 420
[task 2020-04-08T08:00:51.048Z] 08:00:51 INFO - PID 12487 | [Child 12655, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/worker/checkouts/gecko/netwerk/base/nsNetUtil.cpp, line 2307
[task 2020-04-08T08:00:51.048Z] 08:00:51 INFO - PID 12487 | [Child 12655, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/checkouts/gecko/caps/ContentPrincipal.cpp, line 398
[task 2020-04-08T08:00:51.049Z] 08:00:51 INFO - PID 12487 | [Child 12655, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/worker/checkouts/gecko/caps/ContentPrincipal.cpp, line 420
[task 2020-04-08T08:00:51.050Z] 08:00:51 INFO - PID 12487 | [Child 12655, Main Thread] WARNING: Extra shutdown CC: 'i < NORMAL_SHUTDOWN_COLLECTIONS', file /builds/worker/checkouts/gecko/xpcom/base/nsCycleCollector.cpp, line 3352
[task 2020-04-08T08:00:51.071Z] 08:00:51 INFO - PID 12487 | nsStringStats
[task 2020-04-08T08:00:51.072Z] 08:00:51 INFO - PID 12487 | => mAllocCount: 24334
[task 2020-04-08T08:00:51.072Z] 08:00:51 INFO - PID 12487 | => mReallocCount: 0
[task 2020-04-08T08:00:51.073Z] 08:00:51 INFO - PID 12487 | => mFreeCount: 24334
[task 2020-04-08T08:00:51.074Z] 08:00:51 INFO - PID 12487 | => mShareCount: 11651
[task 2020-04-08T08:00:51.074Z] 08:00:51 INFO - PID 12487 | => mAdoptCount: 447
[task 2020-04-08T08:00:51.075Z] 08:00:51 INFO - PID 12487 | => mAdoptFreeCount: 479
[task 2020-04-08T08:00:51.075Z] 08:00:51 INFO - PID 12487 | => Process ID: 12655, Thread ID: 139705266898816
[task 2020-04-08T08:01:05.852Z] 08:01:05 INFO - PID 12487 | console.error: (new Error("SessionFile is closed", "resource:///modules/sessionstore/SessionFile.jsm", 433))
[task 2020-04-08T08:01:54.634Z] 08:01:54 INFO - PID 12487 | XPCOM_MEM_BLOAT_LOG: /tmp/tmpsvOl6e/runtests_leaks_986.log
[task 2020-04-08T08:01:54.634Z] 08:01:54 INFO - PID 12487 | Writing to log: /tmp/tmpsvOl6e/runtests_leaks_986.log
[task 2020-04-08T08:01:54.635Z] 08:01:54 INFO - PID 12487 | Hit MOZ_CRASH(Shutdown hanging before starting.) at /builds/worker/checkouts/gecko/toolkit/components/terminator/nsTerminator.cpp:203
[task 2020-04-08T08:01:54.651Z] 08:01:54 INFO - STDOUT: Initializing stack-fixing for the first stack frame, this may take a while...
[task 2020-04-08T08:01:54.679Z] 08:01:54 INFO - PID 12487 | #01: _pt_root [hg:hg.mozilla.org/integration/autoland:nsprpub/pr/src/pthreads/ptthread.c:fc7f46174390746803126bbd7c70c9451f8051c0:204]
[task 2020-04-08T08:01:54.679Z] 08:01:54 CRITICAL - Test harness output was not a valid structured log message:
[task 2020-04-08T08:01:54.679Z] 08:01:54 CRITICAL - fix-stacks error: failed to read breakpad symbols dir /builds/worker/workspace/build/symbols/libpthread.so.0
for /lib/x86_64-linux-gnu/libpthread.so.0
[task 2020-04-08T08:01:54.680Z] 08:01:54 CRITICAL - fix-stacks note: this is expected and harmless for system libraries on debug automation runs
[task 2020-04-08T08:01:54.680Z] 08:01:54 INFO - PID 12487 | #02: ??? [/lib/x86_64-linux-gnu/libpthread.so.0 + 0x76db]
[task 2020-04-08T08:01:54.680Z] 08:01:54 CRITICAL - Test harness output was not a valid structured log message:
[task 2020-04-08T08:01:54.680Z] 08:01:54 CRITICAL - fix-stacks error: failed to read breakpad symbols dir /builds/worker/workspace/build/symbols/libc.so.6
for /lib/x86_64-linux-gnu/libc.so.6
[task 2020-04-08T08:01:54.680Z] 08:01:54 CRITICAL - fix-stacks note: this is expected and harmless for system libraries on debug automation runs
[task 2020-04-08T08:01:54.681Z] 08:01:54 INFO - PID 12487 | #03: clone [/lib/x86_64-linux-gnu/libc.so.6 + 0x12188f]
[task 2020-04-08T08:01:54.681Z] 08:01:54 INFO - PID 12487 | #04: ??? (???:???)
[task 2020-04-08T08:01:54.681Z] 08:01:54 INFO - PID 12487 | ExceptionHandler::GenerateDump cloned child 13610
[task 2020-04-08T08:01:54.681Z] 08:01:54 INFO - PID 12487 | ExceptionHandler::SendContinueSignalToChild sent continue signal to child
[task 2020-04-08T08:01:54.682Z] 08:01:54 INFO - PID 12487 | ExceptionHandler::WaitForContinueSignal waiting for continue signal...
[task 2020-04-08T08:01:54.780Z] 08:01:54 INFO - PID 12487 | Exiting due to channel error.
[task 2020-04-08T08:01:54.782Z] 08:01:54 INFO - PID 12487 | Exiting due to channel error.
[task 2020-04-08T08:01:54.783Z] 08:01:54 INFO - PID 12487 | [GPU 12584, Main Thread] WARNING: Shutting down GPU process early due to a crash!: file /builds/worker/checkouts/gecko/gfx/ipc/GPUParent.cpp, line 519
[task 2020-04-08T08:01:54.783Z] 08:01:54 INFO - PID 12487 | Exiting due to channel error.
[task 2020-04-08T08:01:54.848Z] 08:01:54 INFO - Browser exited with return code 11
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 1•5 years ago
|
||
First of all, this is a tier 2 job, and the job mentioned in comment 0 actually still shows up as green ("Result: success") on Treeherder. So I'm not sure why this bug was filed.
Here is the relevant part of the output:
[task 2020-04-08T08:01:54.679Z] 08:01:54 CRITICAL - Test harness output was not a valid structured log message:
[task 2020-04-08T08:01:54.679Z] 08:01:54 CRITICAL - fix-stacks error: failed to read breakpad symbols dir `/builds/worker/workspace/build/symbols/libpthread.so.0` for `/lib/x86_64-linux-gnu/libpthread.so.0`
[task 2020-04-08T08:01:54.680Z] 08:01:54 CRITICAL - fix-stacks note: this is expected and harmless for system libraries on debug automation runs
As the third line indicates, this "error" isn't a problem, but it is printed to stderr and is apparently in a format that the web platform test harness doesn't like. Note also that this only happens when we've already crashed, so there would have been a subsequent failure anyway.
James, can you give me any pointers on "valid structure log messages" w.r.t. web platform tests? I'm not sure the best way to deal with this. I'd prefer to keep emitting the error to stderr if possible, because it can be useful.
Comment 2•5 years ago
|
||
The wpt harness wraps stdio of child processes and re-logs that output as process_output
mozlog messages [1]. This ends up in the raw log as JSON data which in the CI configuration is then written to stdout. That output ends up being parsed in mozharness in order to do things like count the failures and set the job status. If some other subprocess we don't know about starts writing to std[out|err] in a different format we are unable to process that output and end up with this error. In general having multiple processes writing to the same ouput stream seems quite likely to end up with corrupt output, so if we can find a way to avoid that it would be good. If this is all going through mozrunner [2], maybe we can have a way to pass in an output handler function to fixSymbols
so we can process it like we do for Firefox output (c.f. [3])
[1] https://firefox-source-docs.mozilla.org/mozbase/mozlog.html#data-format
[2] https://searchfox.org/mozilla-central/source/testing/mozbase/mozrunner/mozrunner/utils.py#227
[3] https://searchfox.org/mozilla-central/source/testing/web-platform/tests/tools/wptrunner/wptrunner/browsers/firefox.py#299
Comment 3•5 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #1)
So I'm not sure why this bug was filed.
To get rid of it to prevent confusion if the task actually fails and has these additional failure lines.
Assignee | ||
Comment 4•5 years ago
|
||
Thank you both for the explanations!
I think the best path forward here is to augment fixSymbols
with an optional argument: quiet=False
. When set to true, it will suppress errors being printed to stderr. The web platform test harness can then set quiet=True
.
I'm in the middle of another change relating to fix-stacks
in bug 1628527, so this will have to wait on that.
Assignee | ||
Comment 5•5 years ago
|
||
Because the web platform test harness doesn't recognize them and annotates them
with CRITICAL
, which is disconcerting, even though those errors are typically
not harmful.
Updated•5 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 8•5 years ago
|
||
jgraham: review ping!
Comment 11•5 years ago
|
||
Backed out for failures caused by unexpected keyword argument hide_error
backout: https://hg.mozilla.org/integration/autoland/rev/7b78aab460c6c5dc5c942d081ce1b729826ae07b
failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=299764020&repo=autoland&lineNumber=1522
[task 2020-04-28T11:23:44.858Z] 11:23:44 INFO - REFTEST TEST-START | layout/reftests/css-grid/grid-row-gap-003.html == layout/reftests/css-grid/grid-row-gap-003-ref.html
[task 2020-04-28T11:23:44.858Z] 11:23:44 INFO - Automation Error: Exception caught while running tests
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - Traceback (most recent call last):
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - File "/builds/worker/workspace/build/tests/reftest/remotereftest.py", line 413, in run_test_harness
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - retVal = reftest.runTests(options.tests, options)
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - File "/builds/worker/workspace/build/tests/reftest/runreftest.py", line 616, in runTests
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - return self.runSerialTests(manifests, options, cmdargs)
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - File "/builds/worker/workspace/build/tests/reftest/runreftest.py", line 956, in runSerialTests
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - return run()
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - File "/builds/worker/workspace/build/tests/reftest/runreftest.py", line 947, in run
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - **kwargs)
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - File "/builds/worker/workspace/build/tests/reftest/remotereftest.py", line 372, in runApp
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - e10s=options.e10s)
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - File "/builds/worker/workspace/build/tests/reftest/remoteautomation.py", line 82, in runApp
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - status = self.waitForFinish(timeout, maxTime)
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - File "/builds/worker/workspace/build/tests/reftest/remoteautomation.py", line 132, in waitForFinish
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - status = self.wait(timeout=maxTime, noOutputTimeout=timeout)
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - File "/builds/worker/workspace/build/tests/reftest/remoteautomation.py", line 362, in wait
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - hasOutput = self.read_stdout()
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - File "/builds/worker/workspace/build/tests/reftest/remoteautomation.py", line 308, in read_stdout
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - parsed_messages = self.messageLogger.write(line)
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - File "/builds/worker/workspace/build/tests/reftest/output.py", line 146, in call
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - self.verbatim(line)
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - File "/builds/worker/workspace/build/tests/reftest/output.py", line 162, in verbatim
[task 2020-04-28T11:23:44.860Z] 11:23:44 INFO - line = self.stack_fixer_function(line)
[task 2020-04-28T11:23:44.861Z] 11:23:44 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/mozrunner/utils.py", line 257, in stack_fixer_function
[task 2020-04-28T11:23:44.861Z] 11:23:44 INFO - line, slowWarning=True, breakpadSymsDir=symbolsPath, hide_errors=hideErrors)
[task 2020-04-28T11:23:44.861Z] 11:23:44 INFO - TypeError: fixSymbols() got an unexpected keyword argument 'hide_errors'
Assignee | ||
Comment 14•5 years ago
|
||
Argh, this is another problem caused by Android host utils. I'll have to land the host utils part first, then get the host utils updated, then land the non-host utils part.
Assignee | ||
Comment 15•5 years ago
|
||
Bug 1633901 has landed with the fix_stacks.py
change. aerickson, sorry to do this to you again but could you update the Android host utils from revision https://hg.mozilla.org/mozilla-central/rev/6bb8423186c1 or later? Once that's done, I can land the attached patch.
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 17•5 years ago
|
||
The Android host utils change has landed and try looks good:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=aa2becbbe29b77be5722738b898e78fdbf143655
Comment 18•5 years ago
|
||
Comment 20•5 years ago
|
||
bugherder |
Description
•