Closed Bug 1714615 Opened 2 years ago Closed 1 year ago

High frequency FATAL ERROR: Non-local network connections are disabled and a connection attempt to firefox.settings.services.mozilla.com (52.85.132.97) was made.

Categories

(Firefox :: Remote Settings Client, defect, P5)

defect

Tracking

()

RESOLVED FIXED
105 Branch
Tracking Status
firefox-esr102 --- fixed
firefox103 --- wontfix
firefox104 --- wontfix
firefox105 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: leplatrem)

References

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell disabled][retriggered])

Attachments

(2 files)

Filed by: abutkovits [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=341771896&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/MrOkzgg3ROCCOUdX299fYQ/runs/0/artifacts/public/logs/live_backing.log


[task 2021-06-04T15:41:12.064Z] 15:41:12     INFO -  TEST-PASS | netwerk/test/unit/test_http2-proxy.js | proxy_bad_gateway_failure_isolated - [proxy_bad_gateway_failure_isolated : 609] "undefined" == "undefined"
[task 2021-06-04T15:41:12.064Z] 15:41:12     INFO -  TEST-PASS | netwerk/test/unit/test_http2-proxy.js | proxy_bad_gateway_failure_isolated - [proxy_bad_gateway_failure_isolated : 610] No new session created by 502 - 2 == 2
[task 2021-06-04T15:41:12.064Z] 15:41:12     INFO -  (xpcshell/head.js) | test run_next_test 12 pending (2)
[task 2021-06-04T15:41:12.064Z] 15:41:12     INFO -  (xpcshell/head.js) | test proxy_bad_gateway_failure_isolated finished (2)
[task 2021-06-04T15:41:12.064Z] 15:41:12     INFO -  (xpcshell/head.js) | test run_next_test 12 finished (1)
[task 2021-06-04T15:41:12.065Z] 15:41:12     INFO -  exiting test
[task 2021-06-04T15:41:12.065Z] 15:41:12     INFO -  PID 22665 | FATAL ERROR: Non-local network connections are disabled and a connection attempt to firefox.settings.services.mozilla.com (52.85.132.97) was made.
[task 2021-06-04T15:41:12.065Z] 15:41:12     INFO -  PID 22665 | You should only access hostnames available via the test networking proxy (if running mochitests) or from a test-specific httpd.js server (if running xpcshell tests). Browser services should be disabled or redirected to a local server.
[task 2021-06-04T15:41:12.065Z] 15:41:12     INFO -  PID 22665 | AddressSanitizer:DEADLYSIGNAL
[task 2021-06-04T15:41:12.065Z] 15:41:12     INFO -  PID 22665 | =================================================================
[task 2021-06-04T15:41:12.065Z] 15:41:12    ERROR -  PID 22665 | ==22665==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000001 (pc 0x7f8509ae6343 bp 0x7f84f524bff0 sp 0x7f84f524b820 T4)
[task 2021-06-04T15:41:12.065Z] 15:41:12     INFO -  PID 22665 | ==22665==The signal is caused by a WRITE memory access.
[task 2021-06-04T15:41:12.065Z] 15:41:12     INFO -  PID 22665 | ==22665==Hint: address points to the zero page.
[task 2021-06-04T15:41:12.065Z] 15:41:12     INFO -  PID 22665 |     #0 0x7f8509ae6343 in mozilla::net::nsSocketTransport::InitiateSocket() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp:1276:7
[task 2021-06-04T15:41:12.065Z] 15:41:12     INFO -  PID 22665 |     #1 0x7f8509aeb5a9 in mozilla::net::nsSocketTransport::OnSocketEvent(unsigned int, nsresult, nsISupports*) /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp:2024:22
[task 2021-06-04T15:41:12.065Z] 15:41:12     INFO -  PID 22665 |     #2 0x7f8509b30cca in mozilla::net::nsSocketEvent::Run() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp:94:17
[task 2021-06-04T15:41:12.065Z] 15:41:12     INFO -  PID 22665 |     #3 0x7f85097a6fea in nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1153:16
[task 2021-06-04T15:41:12.065Z] 15:41:12     INFO -  PID 22665 |     #4 0x7f85097b14fc in NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:548:10
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 |     #5 0x7f8509afbd41 in mozilla::net::nsSocketTransportService::Run() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransportService2.cpp:1204:11
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 |     #6 0x7f8509afd9ec in non-virtual thunk to mozilla::net::nsSocketTransportService::Run() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransportService2.cpp
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 |     #7 0x7f85097a6fea in nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1153:16
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 |     #8 0x7f85097b14fc in NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:548:10
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 |     #9 0x7f850a8404b8 in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /builds/worker/checkouts/gecko/ipc/glue/MessagePump.cpp:300:20
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 |     #10 0x7f850a769981 in RunInternal /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:335:10
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 |     #11 0x7f850a769981 in RunHandler /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:328:3
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 |     #12 0x7f850a769981 in MessageLoop::Run() /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:310:3
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 |     #13 0x7f85097a0ab2 in nsThread::ThreadFunc(void*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:395:10
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 |     #14 0x7f852107345e in _pt_root /builds/worker/checkouts/gecko/nsprpub/pr/src/pthreads/ptthread.c:201:5
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 |     #15 0x7f8520c7d6da in start_thread /build/glibc-2ORdQG/glibc-2.27/nptl/pthread_create.c:463
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 |     #16 0x7f8502fa4a3e in clone /build/glibc-2ORdQG/glibc-2.27/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 | AddressSanitizer can not provide additional info.
[task 2021-06-04T15:41:12.066Z] 15:41:12     INFO -  PID 22665 | SUMMARY: AddressSanitizer: SEGV /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp:1276:7 in mozilla::net::nsSocketTransport::InitiateSocket()
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INCOMPLETE
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---

Recent failures are mochitest bc. THere are 44 total failures in the last 7 days on linux1804-64-tsan-qr opt
Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=366887271&repo=mozilla-central&lineNumber=3281

task 2022-02-06T21:52:50.713Z] 21:52:50     INFO - TEST-START | browser/components/newtab/test/browser/browser_topsites_section.js
[task 2022-02-06T21:52:54.524Z] 21:52:54     INFO - GECKO(1971) | FATAL ERROR: Non-local network connections are disabled and a connection attempt to firefox.settings.services.mozilla.com (99.84.70.99) was made.
[task 2022-02-06T21:52:54.524Z] 21:52:54     INFO - GECKO(1971) | You should only access hostnames available via the test networking proxy (if running mochitests) or from a test-specific httpd.js server (if running xpcshell tests). Browser services should be disabled or redirected to a local server.
[task 2022-02-06T21:52:54.524Z] 21:52:54     INFO - GECKO(1971) | ThreadSanitizer:DEADLYSIGNAL
[task 2022-02-06T21:52:54.524Z] 21:52:54     INFO - GECKO(1971) | ==1971==ERROR: ThreadSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f9ec96f3b84 bp 0x7b4c00002f40 sp 0x7f9e3ef107c0 T1983)
[task 2022-02-06T21:52:54.524Z] 21:52:54     INFO - GECKO(1971) | ==1971==The signal is caused by a WRITE memory access.
[task 2022-02-06T21:52:54.524Z] 21:52:54     INFO - GECKO(1971) | ==1971==Hint: address points to the zero page.
[task 2022-02-06T21:52:54.581Z] 21:52:54     INFO - GECKO(1971) |     #0 mozilla::net::nsSocketTransport::InitiateSocket() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp:1237:7 (libxul.so+0x119cb84)
[task 2022-02-06T21:52:54.581Z] 21:52:54     INFO - GECKO(1971) |     #1 mozilla::net::nsSocketTransport::OnSocketEvent(unsigned int, nsresult, nsISupports*) /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp (libxul.so+0x11a0029)
[task 2022-02-06T21:52:54.583Z] 21:52:54     INFO - GECKO(1971) |     #2 mozilla::net::nsSocketEvent::Run() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp:94:17 (libxul.so+0x11cb406)
[task 2022-02-06T21:52:54.584Z] 21:52:54     INFO - GECKO(1971) |     #3 nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1189:16 (libxul.so+0xfb0923)
[task 2022-02-06T21:52:54.586Z] 21:52:54     INFO - GECKO(1971) |     #4 NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:467:10 (libxul.so+0xfb7342)
[task 2022-02-06T21:52:54.587Z] 21:52:54     INFO - GECKO(1971) |     #5 mozilla::net::nsSocketTransportService::Run() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransportService2.cpp:1190:11 (libxul.so+0x11ab917)
[task 2022-02-06T21:52:54.587Z] 21:52:54     INFO - GECKO(1971) |     #6 non-virtual thunk to mozilla::net::nsSocketTransportService::Run() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransportService2.cpp (libxul.so+0x11acd79)
[task 2022-02-06T21:52:54.588Z] 21:52:54     INFO - GECKO(1971) |     #7 nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1189:16 (libxul.so+0xfb0923)
[task 2022-02-06T21:52:54.589Z] 21:52:54     INFO - GECKO(1971) |     #8 NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:467:10 (libxul.so+0xfb7342)
[task 2022-02-06T21:52:54.590Z] 21:52:54     INFO - GECKO(1971) |     #9 mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /builds/worker/checkouts/gecko/ipc/glue/MessagePump.cpp:300:20 (libxul.so+0x192de7e)
[task 2022-02-06T21:52:54.591Z] 21:52:54     INFO - GECKO(1971) |     #10 RunInternal /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:331:10 (libxul.so+0x18aafec)
[task 2022-02-06T21:52:54.592Z] 21:52:54     INFO - GECKO(1971) |     #11 RunHandler /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:324:3 (libxul.so+0x18aafec)
[task 2022-02-06T21:52:54.593Z] 21:52:54     INFO - GECKO(1971) |     #12 MessageLoop::Run() /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:306:3 (libxul.so+0x18aafec)
[task 2022-02-06T21:52:54.594Z] 21:52:54     INFO - GECKO(1971) |     #13 nsThread::ThreadFunc(void*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:391:10 (libxul.so+0xfaca03)
[task 2022-02-06T21:52:54.595Z] 21:52:54     INFO - GECKO(1971) |     #14 _pt_root /builds/worker/checkouts/gecko/nsprpub/pr/src/pthreads/ptthread.c:201:5 (libnspr4.so+0x44fad)
[task 2022-02-06T21:52:54.596Z] 21:52:54     INFO - GECKO(1971) |     #15 __tsan_thread_start_func /builds/worker/fetches/llvm-project/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:967:15 (firefox+0x590cc)
[task 2022-02-06T21:52:54.596Z] 21:52:54     INFO - GECKO(1971) |     #16 start_thread /tmp/glibc/nptl/pthread_create.c:463 (libpthread.so.0+0x76da)
[task 2022-02-06T21:52:54.597Z] 21:52:54     INFO - GECKO(1971) |     #17 __clone /tmp/glibc/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95 (libc.so.6+0x121a3e)
[task 2022-02-06T21:52:54.598Z] 21:52:54     INFO - GECKO(1971) | ThreadSanitizer can not provide additional info.
[task 2022-02-06T21:52:54.599Z] 21:52:54     INFO - GECKO(1971) | SUMMARY: ThreadSanitizer: SEGV /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp:1237:7 in mozilla::net::nsSocketTransport::InitiateSocket()
[task 2022-02-06T21:52:54.599Z] 21:52:54     INFO - GECKO(1971) | ==1971==ABORTING
[task 2022-02-06T21:52:54.712Z] 21:52:54     INFO - GECKO(1971) | Exiting due to channel error.
[task 2022-02-06T21:52:54.713Z] 21:52:54     INFO - GECKO(1971) | Exiting due to channel error.
[task 2022-02-06T21:52:54.713Z] 21:52:54     INFO - GECKO(1971) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=50.0188) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=116.361) Exiting due to channel error.
[task 2022-02-06T21:52:54.713Z] 21:52:54     INFO - GECKO(1971) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=81.0901) Exiting due to channel error.
[task 2022-02-06T21:52:54.713Z] 21:52:54     INFO - GECKO(1971) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=50.1687) Exiting due to channel error.
[task 2022-02-06T21:52:54.714Z] 21:52:54     INFO - GECKO(1971) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=17.887) Exiting due to channel error.
[task 2022-02-06T21:52:54.718Z] 21:52:54     INFO - GECKO(1971) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=78.8936) Exiting due to channel error.
[task 2022-02-06T21:52:54.719Z] 21:52:54     INFO - GECKO(1971) | Exiting due to channel error.
[task 2022-02-06T21:52:54.720Z] 21:52:54     INFO - GECKO(1971) | Exiting due to channel error.
[task 2022-02-06T21:52:54.722Z] 21:52:54     INFO - GECKO(1971) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=264.194) Exiting due to channel error.
[task 2022-02-06T21:52:55.805Z] 21:52:55     INFO - TEST-INFO | Main app process: killed by SIGIOT
[task 2022-02-06T21:52:55.806Z] 21:52:55     INFO - Buffered messages logged at 21:52:50
[task 2022-02-06T21:52:55.806Z] 21:52:55     INFO - Entering test bound topsites_edit
[task 2022-02-06T21:52:55.807Z] 21:52:55     INFO - Buffered messages logged at 21:52:53
[task 2022-02-06T21:52:55.808Z] 21:52:55     INFO - TEST-PASS | browser/components/newtab/test/browser/browser_topsites_section.js | Should find a visible topsite form - true == true - 
[task 2022-02-06T21:52:55.808Z] 21:52:55     INFO - TEST-PASS | browser/components/newtab/test/browser/browser_topsites_section.js | Should find a visible overlay - true == true - 
[task 2022-02-06T21:52:55.809Z] 21:52:55     INFO - Leaving test bound topsites_edit
[task 2022-02-06T21:52:55.809Z] 21:52:55     INFO - Entering test bound topsites_pin_unpin
[task 2022-02-06T21:52:55.810Z] 21:52:55     INFO - Buffered messages finished
[task 2022-02-06T21:52:55.810Z] 21:52:55    ERROR - TEST-UNEXPECTED-FAIL | browser/components/newtab/test/browser/browser_topsites_section.js | application terminated with exit code -6
[task 2022-02-06T21:52:55.811Z] 21:52:55     INFO - runtests.py | Application ran for: 0:04:36.162114
[task 2022-02-06T21:52:55.811Z] 21:52:55     INFO - zombiecheck | Reading PID log: /tmp/tmpmt3s2zvmpidlog
[task 2022-02-06T21:52:55.812Z] 21:52:55     INFO - ==> process 1971 launched child process 1987
[task 2022-02-06T21:52:55.812Z] 21:52:55     INFO - ==> process 1971 launched child process 2046

[task 2022-02-06T21:52:57.417Z] 21:52:57     INFO - TEST-INFO | started process GECKO(2789)
[task 2022-02-06T21:53:02.275Z] 21:53:02     INFO - GECKO(2789) | 1644184382274	Marionette	INFO	Marionette enabled
[task 2022-02-06T21:53:03.241Z] 21:53:03     INFO - GECKO(2789) | 1644184383240	Marionette	TRACE	Received observer notification toplevel-window-ready
[task 2022-02-06T21:53:13.294Z] 21:53:13     INFO - GECKO(2789) | console.warn: SearchSettings: "get: No settings file exists, new profile?" (new NotFoundError("Could not open the file at /tmp/tmpg8uxz81s.mozrunner/search.json.mozlz4", (void 0)))
[task 2022-02-06T21:53:13.830Z] 21:53:13     INFO - GECKO(2789) | JavaScript error: resource://gre/modules/UpdateService.jsm, line 1343: NS_ERROR_FILE_TARGET_DOES_NOT_EXIST: Component returned failure code: 0x80520006 (NS_ERROR_FILE_TARGET_DOES_NOT_EXIST) [nsIFile.directoryEntries]
[task 2022-02-06T21:53:26.176Z] 21:53:26     INFO - GECKO(2789) | 1644184406175	Marionette	TRACE	Received observer notification marionette-startup-requested
[task 2022-02-06T21:53:26.179Z] 21:53:26     INFO - GECKO(2789) | 1644184406177	Marionette	TRACE	Waiting until startup recorder finished recording startup scripts...
[task 2022-02-06T21:53:26.410Z] 21:53:26     INFO - GECKO(2789) | 1644184406409	Marionette	TRACE	All scripts recorded.
[task 2022-02-06T21:53:26.449Z] 21:53:26     INFO - GECKO(2789) | 1644184406448	Marionette	INFO	Listening on port 2828
[task 2022-02-06T21:53:26.453Z] 21:53:26     INFO - GECKO(2789) | 1644184406451	Marionette	DEBUG	Marionette is listening
[task 2022-02-06T21:53:26.698Z] 21:53:26     INFO - GECKO(2789) | 1644184406696	Marionette	DEBUG	Accepted connection 0 from 127.0.0.1:55006
[task 2022-02-06T21:53:26.723Z] 21:53:26     INFO - GECKO(2789) | 1644184406722	Marionette	DEBUG	Closed connection 0
[task 2022-02-06T21:53:26.726Z] 21:53:26     INFO - GECKO(2789) | 1644184406724	Marionette	DEBUG	Accepted connection 1 from 127.0.0.1:55008
[task 2022-02-06T21:53:26.809Z] 21:53:26     INFO - GECKO(2789) | 1644184406808	Marionette	DEBUG	1 -> [0,1,"WebDriver:NewSession",{"strictFileInteractability":true}]
[task 2022-02-06T21:53:26.938Z] 21:53:26     INFO - GECKO(2789) | 1644184406936	RemoteAgent	TRACE	[19] Document already finished loading: https://example.com/
[task 2022-02-06T21:53:26.947Z] 21:53:26     INFO - GECKO(2789) | 1644184406945	Marionette	DEBUG	1 <- [1,1,null,{"sessionId":"2bc6841d-fd69-4fbc-b6a1-ac4ca68e9f64","capabilities":{"browserName":"firefox","browserVersion":"98.0a ... .mozrunner","moz:shutdownTimeout":360000,"moz:useNonSpecCompliantPointerOrigin":false,"moz:webdriverClick":true,"proxy":{}}}]
[task 2022-02-06T21:53:26.977Z] 21:53:26     INFO - GECKO(2789) | 1644184406976	Marionette	DEBUG	1 -> [0,2,"Addon:Install",{"path":"/tmp/tmpfth62lur.zip","temporary":false}]
[task 2022-02-06T21:53:27.450Z] 21:53:27     INFO - GECKO(2789) | 1644184407449	Marionette	DEBUG	1 <- [1,2,null,{"value":"special-powers@mozilla.org"}]
[task 2022-02-06T21:53:27.497Z] 21:53:27     INFO - GECKO(2789) | 1644184407496	Marionette	DEBUG	1 -> [0,3,"Addon:Install",{"path":"/tmp/tmpeh_gd04r.zip","temporary":false}]
[task 2022-02-06T21:53:27.718Z] 21:53:27     INFO - GECKO(2789) | 1644184407717	Marionette	DEBUG	1 <- [1,3,null,{"value":"mochikit@mozilla.org"}]
[task 2022-02-06T21:53:27.752Z] 21:53:27     INFO - GECKO(2789) | 1644184407751	Marionette	DEBUG	1 -> [0,4,"Marionette:GetContext",{}]
[task 2022-02-06T21:53:27.754Z] 21:53:27     INFO - GECKO(2789) | 1644184407753	Marionette	DEBUG	1 <- [1,4,null,{"value":"content"}]
[task 2022-02-06T21:53:27.760Z] 21:53:27     INFO - GECKO(2789) | 1644184407759	Marionette	DEBUG	1 -> [0,5,"Marionette:SetContext",{"value":"chrome"}]
[task 2022-02-06T21:53:27.763Z] 21:53:27     INFO - GECKO(2789) | 1644184407761	Marionette	DEBUG	1 <- [1,5,null,{"value":null}]
[task 2022-02-06T21:53:27.772Z] 21:53:27     INFO - GECKO(2789) | 1644184407770	Marionette	DEBUG	1 -> [0,6,"WebDriver:ExecuteScript",{"script":"/* This Source Code Form is subject to the terms of the Mozilla Public\n * License, ... hrome","testUrl":"about:blank"}],"newSandbox":true,"sandbox":"default","line":1945,"filename":"tests/mochitest/runtests.py"}]
[task 2022-02-06T21:53:27.818Z] 21:53:27     INFO - GECKO(2789) | 1644184407817	Marionette	TRACE	[7] MarionetteCommands actor created for window id 2
[task 2022-02-06T21:53:28.075Z] 21:53:28     INFO - GECKO(2789) | 1644184408074	Marionette	TRACE	Received observer notification domwindowopened
[task 2022-02-06T21:53:28.129Z] 21:53:28     INFO - GECKO(2789) | 1644184408127	Marionette	TRACE	Remoteness change detected. Set new top-level browsing context to 51
[task 2022-02-06T21:53:28.197Z] 21:53:28     INFO - GECKO(2789) | 1644184408195	Marionette	DEBUG	1 <- [1,6,null,{"value":null}]
[task 2022-02-06T21:53:28.236Z] 21:53:28     INFO - GECKO(2789) | 1644184408235	Marionette	DEBUG	1 -> [0,7,"Marionette:SetContext",{"value":"content"}]
[task 2022-02-06T21:53:28.239Z] 21:53:28     INFO - GECKO(2789) | 1644184408237	Marionette	DEBUG	1 <- [1,7,null,{"value":null}]
[task 2022-02-06T21:53:28.471Z] 21:53:28     INFO - GECKO(2789) | 1644184408470	Marionette	DEBUG	1 -> [0,8,"WebDriver:DeleteSession",{}]
[task 2022-02-06T21:53:28.493Z] 21:53:28     INFO - GECKO(2789) | 1644184408492	Marionette	DEBUG	1 <- [1,8,null,{"value":null}]
[task 2022-02-06T21:53:28.612Z] 21:53:28     INFO - runtests.py | Waiting for browser...
[task 2022-02-06T21:53:28.652Z] 21:53:28     INFO - GECKO(2789) | 1644184408651	Marionette	DEBUG	Closed connection 1

Dragana, can you please take a look?

Flags: needinfo?(dd.mozilla)
Summary: Intermittent FATAL ERROR: Non-local network connections are disabled and a connection attempt to firefox.settings.services.mozilla.com (52.85.132.97) was made.after xpcshell return code: 1 → Intermittent FATAL ERROR: Non-local network connections are disabled and a connection attempt to firefox.settings.services.mozilla.com (52.85.132.97) was made.
Whiteboard: [stockwell needswork:owner]

Recent failures have been observed while browser/components/newtab/test/browser/browser_topsites_section.js got ran.

Component: DOM: Networking → New Tab Page
Flags: needinfo?(dd.mozilla)
Product: Core → Firefox
Assignee: nobody → apavel
Assignee: apavel → nobody
Keywords: leave-open
Attachment #9264919 - Attachment description: Bug 1714615 - disable browser_topsites_section.js on linux opt r?#intermittent-reviewers → Bug 1714615 - disable browser_topsites_section.js on linux tsan r?#intermittent-reviewers
Whiteboard: [stockwell needswork:owner] → [stockwell needswork:owner][stockwell disabled]
Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8e3ce1bfafee
disable browser_topsites_section.js on linux tsan r=intermittent-reviewers,jmaher

Update:
There have been 32 failures within the last 7 days, all on Linux 18.04 x64 WebRender tsan opt.
Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=370028133&repo=mozilla-central&lineNumber=4105

[task 2022-03-04T22:08:56.232Z] 22:08:56     INFO - TEST-START | browser/components/newtab/test/browser/browser_topsites_contextMenu_options.js
[task 2022-03-04T22:09:02.106Z] 22:09:02     INFO - GECKO(4543) | FATAL ERROR: Non-local network connections are disabled and a connection attempt to firefox.settings.services.mozilla.com (18.67.65.99) was made.
[task 2022-03-04T22:09:02.107Z] 22:09:02     INFO - GECKO(4543) | You should only access hostnames available via the test networking proxy (if running mochitests) or from a test-specific httpd.js server (if running xpcshell tests). Browser services should be disabled or redirected to a local server.
[task 2022-03-04T22:09:02.107Z] 22:09:02     INFO - GECKO(4543) | ThreadSanitizer:DEADLYSIGNAL
[task 2022-03-04T22:09:02.108Z] 22:09:02     INFO - GECKO(4543) | ==4543==ERROR: ThreadSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f7df82d9974 bp 0x7b5000124c00 sp 0x7f7d6de187b0 T4555)
[task 2022-03-04T22:09:02.108Z] 22:09:02     INFO - GECKO(4543) | ==4543==The signal is caused by a WRITE memory access.
[task 2022-03-04T22:09:02.109Z] 22:09:02     INFO - GECKO(4543) | ==4543==Hint: address points to the zero page.
[task 2022-03-04T22:09:02.154Z] 22:09:02     INFO - GECKO(4543) |     #0 mozilla::net::nsSocketTransport::InitiateSocket() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp:1237:7 (libxul.so+0x11a5974)
[task 2022-03-04T22:09:02.154Z] 22:09:02     INFO - GECKO(4543) |     #1 mozilla::net::nsSocketTransport::OnSocketEvent(unsigned int, nsresult, nsISupports*) /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp (libxul.so+0x11a8e19)
[task 2022-03-04T22:09:02.156Z] 22:09:02     INFO - GECKO(4543) |     #2 mozilla::net::nsSocketEvent::Run() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp:94:17 (libxul.so+0x11d42b6)
[task 2022-03-04T22:09:02.156Z] 22:09:02     INFO - GECKO(4543) |     #3 nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1167:16 (libxul.so+0xfb7663)
[task 2022-03-04T22:09:02.157Z] 22:09:02     INFO - GECKO(4543) |     #4 NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:467:10 (libxul.so+0xfbe492)
[task 2022-03-04T22:09:02.158Z] 22:09:02     INFO - GECKO(4543) |     #5 mozilla::net::nsSocketTransportService::Run() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransportService2.cpp:1190:11 (libxul.so+0x11b4747)
[task 2022-03-04T22:09:02.159Z] 22:09:02     INFO - GECKO(4543) |     #6 non-virtual thunk to mozilla::net::nsSocketTransportService::Run() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransportService2.cpp (libxul.so+0x11b5ba9)
[task 2022-03-04T22:09:02.160Z] 22:09:02     INFO - GECKO(4543) |     #7 nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1167:16 (libxul.so+0xfb7663)
[task 2022-03-04T22:09:02.161Z] 22:09:02     INFO - GECKO(4543) |     #8 NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:467:10 (libxul.so+0xfbe492)
[task 2022-03-04T22:09:02.161Z] 22:09:02     INFO - GECKO(4543) |     #9 mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /builds/worker/checkouts/gecko/ipc/glue/MessagePump.cpp:300:20 (libxul.so+0x1a07eee)
[task 2022-03-04T22:09:02.162Z] 22:09:02     INFO - GECKO(4543) |     #10 RunInternal /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:331:10 (libxul.so+0x19833ac)
[task 2022-03-04T22:09:02.163Z] 22:09:02     INFO - GECKO(4543) |     #11 RunHandler /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:324:3 (libxul.so+0x19833ac)
[task 2022-03-04T22:09:02.163Z] 22:09:02     INFO - GECKO(4543) |     #12 MessageLoop::Run() /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:306:3 (libxul.so+0x19833ac)
[task 2022-03-04T22:09:02.164Z] 22:09:02     INFO - GECKO(4543) |     #13 nsThread::ThreadFunc(void*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:389:10 (libxul.so+0xfb330c)
[task 2022-03-04T22:09:02.165Z] 22:09:02     INFO - GECKO(4543) |     #14 _pt_root /builds/worker/checkouts/gecko/nsprpub/pr/src/pthreads/ptthread.c:201:5 (libnspr4.so+0x44fad)
[task 2022-03-04T22:09:02.165Z] 22:09:02     INFO - GECKO(4543) |     #15 __tsan_thread_start_func /builds/worker/fetches/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:967:15 (firefox+0x5a0cc)
[task 2022-03-04T22:09:02.166Z] 22:09:02     INFO - GECKO(4543) |     #16 start_thread /tmp/glibc/nptl/pthread_create.c:463 (libpthread.so.0+0x76da)
[task 2022-03-04T22:09:02.167Z] 22:09:02     INFO - GECKO(4543) |     #17 __clone /tmp/glibc/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95 (libc.so.6+0x121a3e)
[task 2022-03-04T22:09:02.167Z] 22:09:02     INFO - GECKO(4543) | ThreadSanitizer can not provide additional info.
[task 2022-03-04T22:09:02.168Z] 22:09:02     INFO - GECKO(4543) | SUMMARY: ThreadSanitizer: SEGV /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp:1237:7 in mozilla::net::nsSocketTransport::InitiateSocket()
[task 2022-03-04T22:09:02.168Z] 22:09:02     INFO - GECKO(4543) | ==4543==ABORTING
[task 2022-03-04T22:09:02.283Z] 22:09:02     INFO - GECKO(4543) | Exiting due to channel error.
[task 2022-03-04T22:09:02.286Z] 22:09:02     INFO - GECKO(4543) | Exiting due to channel error.
[task 2022-03-04T22:09:02.289Z] 22:09:02     INFO - GECKO(4543) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=10.1079) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=41.2605) Exiting due to channel error.
[task 2022-03-04T22:09:02.289Z] 22:09:02     INFO - GECKO(4543) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=41.3484) Exiting due to channel error.
[task 2022-03-04T22:09:02.290Z] 22:09:02     INFO - GECKO(4543) | Exiting due to channel error.
[task 2022-03-04T22:09:02.291Z] 22:09:02     INFO - GECKO(4543) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=99.1872) Exiting due to channel error.
[task 2022-03-04T22:09:02.292Z] 22:09:02     INFO - GECKO(4543) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=66.0716) Exiting due to channel error.
[task 2022-03-04T22:09:02.293Z] 22:09:02     INFO - GECKO(4543) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=257.435) Exiting due to channel error.
[task 2022-03-04T22:09:02.294Z] 22:09:02     INFO - GECKO(4543) | Exiting due to channel error.
[task 2022-03-04T22:09:02.295Z] 22:09:02     INFO - GECKO(4543) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=266.139) Exiting due to channel error.
[task 2022-03-04T22:09:02.695Z] 22:09:02     INFO - GECKO(4543) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=265.322)
[task 2022-03-04T22:09:02.695Z] 22:09:02     INFO - GECKO(4543) | ###!!! [Child][PBackgroundChild] Error: Send(msgname=PBackground::Msg_PServiceWorkerContainerConstructor) Channel error: cannot send/recv
[task 2022-03-04T22:09:02.696Z] 22:09:02     INFO - GECKO(4543) | ###!!! [Child][PBackgroundChild] Error: Send(msgname=PClientSource::Msg_NoteDOMContentLoaded) Channel error: cannot send/recv
[task 2022-03-04T22:09:03.381Z] 22:09:03     INFO - TEST-INFO | Main app process: killed by SIGIOT
[task 2022-03-04T22:09:03.382Z] 22:09:03     INFO - Buffered messages logged at 22:08:56
[task 2022-03-04T22:09:03.383Z] 22:09:03     INFO - Entering test bound defaultTopSites_menuOptions
[task 2022-03-04T22:09:03.383Z] 22:09:03     INFO - Buffered messages logged at 22:08:58
[task 2022-03-04T22:09:03.383Z] 22:09:03     INFO - TEST-PASS | browser/components/newtab/test/browser/browser_topsites_contextMenu_options.js | Number of options is correct - 5 == 5 - 

Hi Dale! Can you please take a look at this? Maybe you can help us assign this to someone.
Thank you!

Flags: needinfo?(dharvey)

There have been 45 total failures in the last 7 days, recent failure log.
Affected platforms are:

  • linux1804-64-tsan-qr

I went a short way down the rabbit-hole to see what's going on here, after running into this on my own Try run.

It looks like there was a recent regression here (around Feb 1st), which made this intermittent failure go from ~never to pretty-frequent:
https://treeherder.mozilla.org/intermittent-failures/bugdetails?startday=2022-01-09&endday=2022-03-16&tree=trunk&bug=1714615

So, something worth looking into while that's in recent memory (before we end up just disabling a bunch of tests, as it seems we've started doing).

Looking at the test logs (with an eye towards "remote settings" which I believe is one name we have for this server, I noticed that the following log-spam seems to appear shortly before the crash, which seems to indicate that our "RSLoader" update timer seems to have unexpectedly (?) kicked off:

INFO - GECKO(4553) | console.debug: RSLoader:
INFO - GECKO(4553) |   Registered update timer
INFO - GECKO(4553) | console.debug: RSLoader:
INFO - GECKO(4553) |   Updating recipes
INFO - GECKO(4553) | console.debug: RSLoader:
INFO - GECKO(4553) |   Got 1 recipes from Remote Settings
INFO - GECKO(4553) | console.debug: RSLoader:
INFO - GECKO(4553) |   No targeting for recipe, so it matches automatically
INFO - GECKO(4553) | console.debug: RSLoader:
INFO - GECKO(4553) |   [experiment] xman_test_message matched
INFO - GECKO(4553) | console.debug: ExperimentManager:
INFO - GECKO(4553) |   New experiment started: test_xman_cfr_1647379167641, control
INFO - GECKO(4553) | console.debug: RSLoader:
INFO - GECKO(4553) |   1 recipes matched. Finalizing ExperimentManager.
INFO - GECKO(4553) | console.debug: ExperimentManager:
INFO - GECKO(4553) |   Recipe unenrolled: test_xman_cfr_1647379167641

https://treeherder.mozilla.org/logviewer?job_id=371166379&repo=autoland&lineNumber=4185

It looks like k88hudson added a mechanism that was intended to turn this off for tests in bug 1709360. Maybe she would be a good person to take a look here, to diagnose why that's (I think?) not working? (e.g. maybe groups of tests need to opt-in to that mechanism and we have some new tests that need to be opted in?)

Side note, I assume this is TSAN-specific due to TSAN builds just being a bit slower, and the fact that this behavior is apparently triggered off of a timer (and the testrun takes long enough for the timer to fire in slower TSAN builds, vs. other builds probably get lucky enough to complete the tests before the timer fires).

Flags: needinfo?(khudson)

Thanks Daniel, hopefully Kate can take a look, clearing needinfo I cant really assign work to people

Flags: needinfo?(dharvey)

There are 73 total failures in the last 7 days on linux1804-64-tsan-qr opt
Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=373198596&repo=mozilla-central&lineNumber=4274

[task 2022-04-02T22:03:06.018Z] 22:03:06     INFO - TEST-START | browser/components/newtab/test/browser/browser_newtab_overrides.js
[task 2022-04-02T22:03:17.357Z] 22:03:17     INFO - GECKO(2398) | FATAL ERROR: Non-local network connections are disabled and a connection attempt to firefox.settings.services.mozilla.com (18.67.65.99) was made.
[task 2022-04-02T22:03:17.357Z] 22:03:17     INFO - GECKO(2398) | You should only access hostnames available via the test networking proxy (if running mochitests) or from a test-specific httpd.js server (if running xpcshell tests). Browser services should be disabled or redirected to a local server.
[task 2022-04-02T22:03:17.357Z] 22:03:17     INFO - GECKO(2398) | ThreadSanitizer:DEADLYSIGNAL
[task 2022-04-02T22:03:17.357Z] 22:03:17     INFO - GECKO(2398) | ==2398==ERROR: ThreadSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f98dc2673a4 bp 0x7b5000124400 sp 0x7f9851d847b0 T2410)
[task 2022-04-02T22:03:17.357Z] 22:03:17     INFO - GECKO(2398) | ==2398==The signal is caused by a WRITE memory access.
[task 2022-04-02T22:03:17.357Z] 22:03:17     INFO - GECKO(2398) | ==2398==Hint: address points to the zero page.
[task 2022-04-02T22:03:17.427Z] 22:03:17     INFO - GECKO(2398) |     #0 mozilla::net::nsSocketTransport::InitiateSocket() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp:1237:7 (libxul.so+0x11b13a4)
[task 2022-04-02T22:03:17.427Z] 22:03:17     INFO - GECKO(2398) |     #1 mozilla::net::nsSocketTransport::OnSocketEvent(unsigned int, nsresult, nsISupports*) /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp (libxul.so+0x11b4849)
[task 2022-04-02T22:03:17.428Z] 22:03:17     INFO - GECKO(2398) |     #2 mozilla::net::nsSocketEvent::Run() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp:94:17 (libxul.so+0x11dff66)
[task 2022-04-02T22:03:17.429Z] 22:03:17     INFO - GECKO(2398) |     #3 nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1181:16 (libxul.so+0xfc2153)
[task 2022-04-02T22:03:17.429Z] 22:03:17     INFO - GECKO(2398) |     #4 NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:465:10 (libxul.so+0xfc82a5)
[task 2022-04-02T22:03:17.430Z] 22:03:17     INFO - GECKO(2398) |     #5 mozilla::net::nsSocketTransportService::Run() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransportService2.cpp:1202:11 (libxul.so+0x11c03f7)
[task 2022-04-02T22:03:17.431Z] 22:03:17     INFO - GECKO(2398) |     #6 non-virtual thunk to mozilla::net::nsSocketTransportService::Run() /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransportService2.cpp (libxul.so+0x11c1859)
[task 2022-04-02T22:03:17.432Z] 22:03:17     INFO - GECKO(2398) |     #7 nsThread::ProcessNextEvent(bool, bool*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:1181:16 (libxul.so+0xfc2153)
[task 2022-04-02T22:03:17.433Z] 22:03:17     INFO - GECKO(2398) |     #8 NS_ProcessNextEvent(nsIThread*, bool) /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:465:10 (libxul.so+0xfc82a5)
[task 2022-04-02T22:03:17.435Z] 22:03:17     INFO - GECKO(2398) |     #9 mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /builds/worker/checkouts/gecko/ipc/glue/MessagePump.cpp:300:20 (libxul.so+0x1a170ae)
[task 2022-04-02T22:03:17.435Z] 22:03:17     INFO - GECKO(2398) |     #10 RunInternal /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:380:10 (libxul.so+0x19915bc)
[task 2022-04-02T22:03:17.435Z] 22:03:17     INFO - GECKO(2398) |     #11 RunHandler /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:373:3 (libxul.so+0x19915bc)
[task 2022-04-02T22:03:17.435Z] 22:03:17     INFO - GECKO(2398) |     #12 MessageLoop::Run() /builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc:355:3 (libxul.so+0x19915bc)
[task 2022-04-02T22:03:17.436Z] 22:03:17     INFO - GECKO(2398) |     #13 nsThread::ThreadFunc(void*) /builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp:385:10 (libxul.so+0xfbdccc)
[task 2022-04-02T22:03:17.436Z] 22:03:17     INFO - GECKO(2398) |     #14 _pt_root /builds/worker/checkouts/gecko/nsprpub/pr/src/pthreads/ptthread.c:201:5 (libnspr4.so+0x44fad)
[task 2022-04-02T22:03:17.437Z] 22:03:17     INFO - GECKO(2398) |     #15 __tsan_thread_start_func /builds/worker/fetches/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:967:15 (firefox+0x5a0cc)
[task 2022-04-02T22:03:17.437Z] 22:03:17     INFO - GECKO(2398) |     #16 start_thread /tmp/glibc/nptl/pthread_create.c:463 (libpthread.so.0+0x76da)
[task 2022-04-02T22:03:17.439Z] 22:03:17     INFO - GECKO(2398) |     #17 __clone /tmp/glibc/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95 (libc.so.6+0x121a3e)
[task 2022-04-02T22:03:17.439Z] 22:03:17     INFO - GECKO(2398) | ThreadSanitizer can not provide additional info.
[task 2022-04-02T22:03:17.439Z] 22:03:17     INFO - GECKO(2398) | SUMMARY: ThreadSanitizer: SEGV /builds/worker/checkouts/gecko/netwerk/base/nsSocketTransport2.cpp:1237:7 in mozilla::net::nsSocketTransport::InitiateSocket()
[task 2022-04-02T22:03:17.440Z] 22:03:17     INFO - GECKO(2398) | ==2398==ABORTING
[task 2022-04-02T22:03:17.542Z] 22:03:17     INFO - GECKO(2398) | Exiting due to channel error.
[task 2022-04-02T22:03:17.548Z] 22:03:17     INFO - GECKO(2398) | Exiting due to channel error.
[task 2022-04-02T22:03:17.548Z] 22:03:17     INFO - GECKO(2398) | Exiting due to channel error.
[task 2022-04-02T22:03:17.549Z] 22:03:17     INFO - GECKO(2398) | Exiting due to channel error.
[task 2022-04-02T22:03:17.551Z] 22:03:17     INFO - GECKO(2398) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=256.047) Exiting due to channel error.
[task 2022-04-02T22:03:17.552Z] 22:03:17     INFO - GECKO(2398) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=27.7372) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=84.966) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=52.6732) Exiting due to channel error.
[task 2022-04-02T22:03:17.552Z] 22:03:17     INFO - GECKO(2398) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=27.8728) Exiting due to channel error.
[task 2022-04-02T22:03:17.552Z] 22:03:17     INFO - GECKO(2398) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=52.7242) Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=256.082) Exiting due to channel error.
[task 2022-04-02T22:03:17.552Z] 22:03:17     INFO - GECKO(2398) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=265.299) Exiting due to channel error.
[task 2022-04-02T22:03:17.552Z] 22:03:17     INFO - GECKO(2398) | Crash Annotation GraphicsCriticalError: |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=266.218) Exiting due to channel error.
[task 2022-04-02T22:03:18.617Z] 22:03:18     INFO - TEST-INFO | Main app process: killed by SIGIOT

Kate, did you get a chance to take a look at this?

Flags: needinfo?(khudson)
Flags: needinfo?(khudson)
Whiteboard: [stockwell disabled][stockwell disable-recommended] → [stockwell disabled][stockwell needswork:owner]
Flags: needinfo?(khudson)

There have been 32 failures in the last 7 days.

Happens on:

  • linux1804-64-tsan-qr opt
  • windows10-32-2004-qr debug and opt

Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=373978394&repo=autoland&lineNumber=4164

Kate, could you look into this?

Flags: needinfo?(khudson)

There have been 41 total failures in the last 7 days, recent failure log.
Affected platforms are:

  • linux1804-64-tsan-qr

(In reply to Daniel Holbert [:dholbert] from comment #39)

[...] log-spam seems to appear shortly before the crash, which seems to indicate that our "RSLoader" update timer seems to have unexpectedly (?) kicked off

Given this diagnosis (and more from comment 39), I suspect this bug really belongs in the Remote Settings Client component. Hopefully/maybe it may get more traction from folks who can help over there. --> reclassifying.

Component: New Tab Page → Remote Settings Client

(Looks like mathieu is active in that component as well; mathieu, maybe you could take a look here? See comment 39; it looks like remote settings is making network connections during test runs, after some delay, and those network connections cause trouble & cause insta-death because CI test runs aren't supposed to touch the network. It looks like bug 1709360 was possibly meant to prevent that, but that seems not to be taking effect here.)

Flags: needinfo?(mathieu)

This is failing quite frequently.
All the retriggers here failed but in this backfill range there are some green jobs.
Hi Daniel! Can you please take another look at this?
Thank you!

Flags: needinfo?(dholbert)
Summary: Intermittent FATAL ERROR: Non-local network connections are disabled and a connection attempt to firefox.settings.services.mozilla.com (52.85.132.97) was made. → High frequency FATAL ERROR: Non-local network connections are disabled and a connection attempt to firefox.settings.services.mozilla.com (52.85.132.97) was made.
Whiteboard: [stockwell disabled][stockwell unknown] → [stockwell disabled][retriggered]

Hi! See my comments above -- I did take a look already. Someone who knows our remote-settings code needs to take it from here.

Flags: needinfo?(dholbert)
Flags: needinfo?(mathieu)
Flags: needinfo?(khudson)

Mathieu, could you please take a look or find someone who can take a look? See my diagnosis in comment 39 and following comments.

This has been causing quite a bit of intermittent-failure trouble, and has been for some time, and really needs to be addressed. And it seems like bug 1709360 was meant to prevent this ("We should turn off automatic syncing of Nimbus experiments in general for tests") but it isn't working for some reason.

(Without being disabled, it looks like remote settings insists on making a network connection after a timeout elapses, and network connections intentionally are caught & flagged as badness/test-failures in our test runs, which is why this is causing trouble here.)

Flags: needinfo?(mathieu)

--> marking this as blocking bug 1709360, since there seems to be a theoretical connection at least and since bug 1709360 seems to not-actually-be-fixed as demonstrated by the existence of this bug here.

Blocks: 1709360

Looks like this became frequent with the changes in Bug 1776609, retriggers here.
Daisuke, any idea why those changes would impact this kind of failures?

Failure log: https://treeherder.mozilla.org/logviewer?job_id=386092521&repo=autoland&lineNumber=4333

Flags: needinfo?(mathieu) → needinfo?(daisuke)

leaving the initial needinfo active

Flags: needinfo?(mathieu)

We can set an invalid URL for the remote settings server URL, and this should disable connections.
The preference will be read in tests if MOZ_DISABLE_NONLOCAL_CONNECTIONS is set.

user_pref("services.settings.server", "invalid://err");

Would this be the file? https://searchfox.org/mozilla-central/source/js/src/tests/user.js

For the RS test suites, we only use xpcshell tests, and we set the server URL preference to a local server, so it should not break them.

Flags: needinfo?(mathieu)

(In reply to Cosmin Sabou [:CosminS] from comment #70)

Daisuke, any idea why those changes would impact this kind of failures?

This kind of failure is just a "time-bomb" from a timer expiring. So: any patch that increases the duration of this overall testrun (e.g. by adding new tests, or adding a test such that existing tests get rebucketed in a way that happens to increase a task's overall duration) would make this more-frequent. Without having dug too deeply, this is the most-likely explanation for why daisuke's patch would've increased the frequency here.

(In reply to Mathieu Leplatre [:leplatrem] from comment #72)

We can set an invalid URL for the remote settings server URL, and this should disable connections.

Nice, that sounds like a good approach.

Would this be the file? https://searchfox.org/mozilla-central/source/js/src/tests/user.js

I'm not familiar with that file, but I think it's likely specific to the JS-engine tests that live in that subtree.

You probably want this file -- it's used for mochitests, which is the category of test that's having trouble here:
https://searchfox.org/mozilla-central/source/testing/profiles/common/user.js

Flags: needinfo?(daisuke)

(note: when we fix this, we should be sure to also revert the test-disabling that was landed earlier here in https://hg.mozilla.org/integration/autoland/rev/8e3ce1bfafee )

Assignee: nobody → mathieu

You probably want this file -- it's used for mochitests, which is the category of test that's having trouble here:
https://searchfox.org/mozilla-central/source/testing/profiles/common/user.js

Actually, we already set the server URL in this file:
https://searchfox.org/mozilla-central/rev/560bbebd595cf2f6a9861f3f67dd0739f4c4c559/testing/profiles/common/user.js#62

And while debugging, I could verify that the server URL comes from:
https://searchfox.org/mozilla-central/rev/560bbebd595cf2f6a9861f3f67dd0739f4c4c559/testing/profiles/unittest-required/user.js#202

./mach test browser_topsites_section
...
...
0:02.46 GECKO(76832) console.info: "Utils.SERVER_URL" "http://127.0.0.1:8888/dummy-kinto/v1"`

I also ran ./mach test browser/components/newtab/test/browser/ and grep the logs. No test had a SERVER_URL pointing to https://firefox.settings.services.mozilla.com

Could there be something specific about tsan tests/builds? Like a different profile template or something? (server url is set in various profiles already, including common/user.js)

Flags: needinfo?(dholbert)

AFAIK, the only special thing about TSAN tests/builds is that they're really slow.

Your notes in comment 76 indicate that this pref-value (services.settings.server) must not be the relevant thing that's giving us this value.

Theory #1 (unlikely): maybe there's a race condition with setting the pref value, and we somehow kick off a timeout with the "bad" server before we've managed to set the nerfed pref value?

Theory #2 (more likely): maybe some piece of Firefox isn't bothering to check the pref, and is instead just directly connecting to https://firefox.settings.services.mozilla.com. It looks like we do have a 10 hardcoded references to that server in our sources:
https://searchfox.org/mozilla-central/search?q=firefox.settings.services.mozilla.com&path=

Presumably one of those usages is getting through (and should be updated to check this pref and use it instead if specified)?

Flags: needinfo?(dholbert)

(expanding on theory #2: from a quick sampling, it looks like the failures here are in new-tab code. And we do have two new-tab-associated files with a hardcoded usage of this server URL:
https://searchfox.org/mozilla-central/rev/560bbebd595cf2f6a9861f3f67dd0739f4c4c559/browser/components/newtab/data/content/activity-stream.bundle.js#1506
https://searchfox.org/mozilla-central/rev/560bbebd595cf2f6a9861f3f67dd0739f4c4c559/browser/components/newtab/content-src/components/ASRouterAdmin/ASRouterAdmin.jsx#1119

Presumably that code should the pref-controlled server value preferentially, but it's not clear to me whether (or where) it does that right now.

I imagine one thing that would help here would be to know the actual/full URL that we're failing to access here (rather than just the domain).

We get the domain ("a connection attempt to firefox.settings.services.mozilla.com") from some logging that gets spit out before we crash, and the actual MOZ_CRASH_UNSAFE crash-macro that we're using here tries to print out the URL as part of the crash message string, but unfortunately this macro only prints its string in debug/fuzzing builds. (And these TSAN runs are non-debug/non-fuzzing opt builds.)

Maybe worth adding some ifdef NIGHTLY_BUILD && MOZ_TSAN-guarded logging just before the crash which prints out the full URL, and then we can check for that in the logs of the next failures after that point, and that might give us a clue about where the connection is coming from.

(The ifdeffing would just be for an abundance of caution, to avoid unnecessarily logging user-accessed URLs in actual builds that actual users are running.)

(In reply to Daniel Holbert [:dholbert] from comment #79)

Maybe worth adding some ifdef NIGHTLY_BUILD && MOZ_TSAN-guarded logging just before the crash which prints out the full URL

I did this on Try here:
https://treeherder.mozilla.org/jobs?repo=try&revision=b84009b88da58b08ed84b13bf503156e95a9fb3c

I'll spam some retriggers on that cycle, and hopefully will trigger this bug. Worst-case if the bug doesn't show up on that Try run's retriggers, we could land the patch (or something like it) and wait for additional failures on autoland/central.

Aha, my Try run got "lucky", no retriggers required!

Here's the log for the failure that includes a full URI:
https://treeherder.mozilla.org/logviewer?job_id=386307497&repo=try&lineNumber=4409

And here's the logging:

INFO - GECKO(2909) | Attempting to connect to non-local address! opener is [resource://services-settings/Utils.jsm:279:14], uri is [https://firefox.settings.services.mozilla.com/v1/buckets/monitor/collections/changes/changeset?_expected=0]

So looking for monitor/collections in our source, it looks like this is related to
https://searchfox.org/mozilla-central/rev/417eb46de41c31c3223d7b84c275ba56fc7d4343/services/settings/Utils.jsm#110

And presumably someone's calling fetchLatestChanges with serverURL corresponding to the actual live URL. Looking at calls to fetchLatestChanges, it looks like all of the calls pass in lazy.Utils.SERVER_URL, so that SERVER_URL function must be returning the actual live https://firefox.settings.services.mozilla.com URL at some point in time.

I added some logging to the SERVER_URL function, and got some logging during a failure here:
https://treeherder.mozilla.org/logviewer?job_id=386313801&repo=try&lineNumber=4506
https://firefoxci.taskcluster-artifacts.net/ExTDYv7WRuiZieoAfhsi_w/0/public/logs/live_backing.log

Look in particular at the logging for PID 2916 in that log (the "In SERVER_URL" lines are my added logging, via dump() statements inside that JS function):

[task 2022-08-03T20:37:43.045Z] 20:37:43     INFO - TEST-INFO | started process GECKO(2916)
[task 2022-08-03T20:37:46.105Z] 20:37:46     INFO - GECKO(2916) | 1659559066104	Marionette	INFO	Marionette enabled
[task 2022-08-03T20:37:46.116Z] 20:37:46     INFO - GECKO(2916) | 1659559066115	Marionette	TRACE	Received observer notification final-ui-startup
[task 2022-08-03T20:37:46.126Z] 20:37:46     INFO - GECKO(2916) | 1659559066124	Marionette	INFO	Listening on port 2828
[task 2022-08-03T20:37:46.127Z] 20:37:46     INFO - GECKO(2916) | 1659559066125	Marionette	DEBUG	Marionette is listening
[task 2022-08-03T20:37:46.902Z] 20:37:46     INFO - GECKO(2916) | 1659559066900	Marionette	DEBUG	Accepted connection 0 from 127.0.0.1:41700
[task 2022-08-03T20:37:47.143Z] 20:37:47     INFO - GECKO(2916) | 1659559067142	Marionette	DEBUG	Closed connection 0
[task 2022-08-03T20:37:47.152Z] 20:37:47     INFO - GECKO(2916) | 1659559067151	Marionette	DEBUG	Accepted connection 1 from 127.0.0.1:41702
[task 2022-08-03T20:37:47.355Z] 20:37:47     INFO - GECKO(2916) | ###dholbert-logging: In SERVER_URL; lazy.allowServerURLOverride is true. Returning lazy.gServerURL: http://127.0.0.1:8888/dummy-kinto/v1
[task 2022-08-03T20:37:47.378Z] 20:37:47     INFO - GECKO(2916) | 1659559067377	Marionette	DEBUG	Closed connection 1
[task 2022-08-03T20:37:47.381Z] 20:37:47     INFO - GECKO(2916) | 1659559067379	Marionette	DEBUG	Accepted connection 2 from 127.0.0.1:41704
[task 2022-08-03T20:37:48.802Z] 20:37:48     INFO - GECKO(2916) | 1659559068800	Marionette	DEBUG	2 -> [0,1,"WebDriver:NewSession",{"strictFileInteractability":true}]
[task 2022-08-03T20:37:48.828Z] 20:37:48     INFO - GECKO(2916) | 1659559068826	Marionette	DEBUG	Waiting for initial application window
[task 2022-08-03T20:37:53.090Z] 20:37:53     INFO - GECKO(2916) | console.warn: SearchSettings: "get: No settings file exists, new profile?" (new NotFoundError("Could not open the file at /tmp/tmpn9l63ksl.mozrunner/search.json.mozlz4", (void 0)))
[task 2022-08-03T20:37:53.295Z] 20:37:53     INFO - GECKO(2916) | ###dholbert-logging: In SERVER_URL; lazy.allowServerURLOverride is true. Returning lazy.gServerURL: http://127.0.0.1:8888/dummy-kinto/v1
[task 2022-08-03T20:37:53.298Z] 20:37:53     INFO - GECKO(2916) | ###dholbert-logging: In SERVER_URL; lazy.allowServerURLOverride is true. Returning lazy.gServerURL: http://127.0.0.1:8888/dummy-kinto/v1
[task 2022-08-03T20:37:53.329Z] 20:37:53     INFO - GECKO(2916) | ###dholbert-logging: In SERVER_URL; lazy.allowServerURLOverride is true. Returning lazy.gServerURL: http://127.0.0.1:8888/dummy-kinto/v1
[task 2022-08-03T20:37:53.389Z] 20:37:53     INFO - GECKO(2916) | ###dholbert-logging: In SERVER_URL; lazy.allowServerURLOverride is true. Returning lazy.gServerURL: http://127.0.0.1:8888/dummy-kinto/v1
[task 2022-08-03T20:37:53.468Z] 20:37:53     INFO - GECKO(2916) | ###dholbert-logging: In SERVER_URL; lazy.allowServerURLOverride is true. Returning lazy.gServerURL: http://127.0.0.1:8888/dummy-kinto/v1
[task 2022-08-03T20:37:54.680Z] 20:37:54     INFO - GECKO(2916) | ###dholbert-logging: In SERVER_URL; lazy.allowServerURLOverride is true. Returning lazy.gServerURL: http://127.0.0.1:8888/dummy-kinto/v1
[task 2022-08-03T20:37:55.306Z] 20:37:55     INFO - GECKO(2916) | ###dholbert-logging: In SERVER_URL; lazy.allowServerURLOverride is true. Returning lazy.gServerURL: http://127.0.0.1:8888/dummy-kinto/v1
[task 2022-08-03T20:37:58.519Z] 20:37:58     INFO - GECKO(2916) | ###dholbert-logging: In SERVER_URL; lazy.allowServerURLOverride is true. Returning lazy.gServerURL: http://127.0.0.1:8888/dummy-kinto/v1
[task 2022-08-03T20:37:59.514Z] 20:37:59     INFO - GECKO(2916) | 1659559079512	Marionette	TRACE	Received observer notification browser-idle-startup-tasks-finished
[task 2022-08-03T20:37:59.564Z] 20:37:59     INFO - GECKO(2916) | 1659559079562	RemoteAgent	TRACE	[24] Document already finished loading: about:blank
[...]
[task 2022-08-03T20:38:01.848Z] 20:38:01     INFO - *** Start BrowserChrome Test Results ***
[task 2022-08-03T20:38:02.100Z] 20:38:02     INFO - checking window state
[task 2022-08-03T20:38:02.277Z] 20:38:02     INFO - TEST-START | browser/components/newtab/test/browser/browser_aboutwelcome_configurable_ui.js
[task 2022-08-03T20:38:03.801Z] 20:38:03     INFO - GECKO(2916) | console.debug: AboutWelcomeChild:
[...]

[task 2022-08-03T20:38:04.831Z] 20:38:04     INFO - GECKO(2916) | ###dholbert-logging: In SERVER_URL; lazy.allowServerURLOverride is true. Returning lazy.gServerURL: http://127.0.0.1:8888/dummy-kinto/v1
[...]

[task 2022-08-03T20:41:42.661Z] 20:41:42     INFO - TEST-OK | browser/components/newtab/test/browser/browser_remote_l10n.js | took 366ms
[task 2022-08-03T20:41:42.741Z] 20:41:42     INFO - checking window state
[task 2022-08-03T20:41:42.809Z] 20:41:42     INFO - TEST-START | browser/components/newtab/test/browser/browser_topsites_annotation.js
[task 2022-08-03T20:42:16.407Z] 20:42:16     INFO - GECKO(2916) | ###dholbert-logging: In SERVER_URL; lazy.allowServerURLOverride is true. Returning lazy.gServerURL: https://firefox.settings.services.mozilla.com/v1
[task 2022-08-03T20:42:16.486Z] 20:42:16     INFO - GECKO(2916) | FATAL ERROR: Non-local network connections are disabled and a connection attempt to firefox.settings.services.mozilla.com (18.160.225.118) was made.

That last SERVER_URL access (still in the same process ID) is getting a different lazy.gServerURL value for some reason (the problematic "real" value, https://firefox.settings.services.mozilla.com/v1).

One possible issue/explanation that I'm seeing... Some tests apparently use a utils file to set and then clear this services.settings.server pref, and they clear it using clearUserPref, which might be literally clearing the pref entirely (i.e. resetting the local-URL value that we set for this pref in a user.js file).

We probably want to be using pushPrefEnv rather than explicitly setting/clearing the pref.

Yeah, it looks like that test utils file (RemoteImagesTestUtils.jsm) and its clearUserPref invocation is the culprit here. CC'ing barrett who seems to have touched that line last, as an FYI for them.

I pushed a dummy try patch to replace the clearUserPref with setCharPref with the actual expected underlying value, and that seems to have made the issue go away on tsan bc5 test runs (18 successful runs, 2 infra exceptions, no instances of this issue):
https://treeherder.mozilla.org/jobs?repo=try&revision=5bc587ddcfeafa37bb7fc86649ce68e67168fea1&selectedTaskRun=VPweCOWPTbqBUeELOaIALw.0

(Previous try runs would hit this issue on essentially every single tsan bc5 test run. So 18 green is a pretty good signal of the issue going away.)

We don't want to literally do this setCharPref thing, of course -- we probably really want to do pushPrefEnv or something like it, though we don't seem to have SpecialPowers in this jsm file and I'm not sure what precisely the right invocation would be to get it.

I think SpecialPowers is only available in mochitest files, so regular JSMs cannot access it. I'll update the test utils to save the last value and restore it instead of clearing the pref.

Looks like that may've done it!

The most recent failure report was about an hour before the fix landed in bug 1783093. And we've now had 3 days of no reported failures since then.

So: I think we can call this fixed by bug 1783093.

Status: REOPENED → RESOLVED
Closed: 2 years ago1 year ago
Resolution: --- → FIXED

This test was disabled in an attempt to mitigate this bug. As it turns out, the
underlying issue was in a different test, and that test has now been fixed, so
we can safely re-enable this one.

Attachment #9288735 - Attachment description: WIP: Bug 1714615 - re-enable browser_topsites_section.js on linux tsan. (no review) → Bug 1714615 - re-enable browser_topsites_section.js on linux tsan. (no review)
Pushed by dholbert@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/be15905b1893
re-enable browser_topsites_section.js on linux tsan. (no review)

A patch has been attached on this bug, which was already closed. Filing a separate bug will ensure better tracking. If this was not by mistake and further action is needed, please alert the appropriate party. (Or: if the patch doesn't change behavior -- e.g. landing a test case, or fixing a typo -- then feel free to disregard this message)

Target Milestone: --- → 105 Branch
See Also: → 1833508
You need to log in before you can comment on or make changes to this bug.