Closed
Bug 1108587
Opened 10 years ago
Closed 7 years ago
Intermittent browser_alarms.js,test_webgl_request_mismatch.html | application crashed [@ mozilla::`anonymous namespace'::RunWatchdog(void *)]
Categories
(Toolkit :: Async Tooling, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla58
People
(Reporter: cbook, Assigned: Yoric)
References
(Blocks 1 open bug, )
Details
(Keywords: crash, intermittent-failure, Whiteboard: [necko-backlog][stockwell fixed:other])
Crash Data
Attachments
(1 file)
Windows 7 32-bit mozilla-central opt test mochitest-browser-chrome-2
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=729651&repo=mozilla-central
05:08:22 WARNING - PROCESS-CRASH | chrome://mochitests/content/browser/hal/tests/browser_alarms.js | application crashed [@ mozilla::`anonymous namespace'::RunWatchdog(void *)]
05:08:22 INFO - Crash dump filename: c:\users\cltbld\appdata\local\temp\tmppamfqt.mozrunner\minidumps\fae4ff76-b276-4974-baed-09e30a9c8aa8.dmp
05:08:22 INFO - Operating system: Windows NT
05:08:22 INFO - 6.1.7601 Service Pack 1
05:08:22 INFO - CPU: x86
05:08:22 INFO - GenuineIntel family 6 model 30 stepping 5
05:08:22 INFO - 8 CPUs
05:08:22 INFO - Crash reason: EXCEPTION_BREAKPOINT
05:08:22 INFO - Crash address: 0x62fe481b
05:08:22 INFO - Thread 42 (crashed)
05:08:22 INFO - 0 xul.dll!mozilla::`anonymous namespace'::RunWatchdog(void *) [nsTerminator.cpp:035a951fc24a : 151 + 0x0]
05:08:22 INFO - eip = 0x62fe481b esp = 0x1735ff18 ebp = 0x1735ff1c ebx = 0x002cf0d8
05:08:22 INFO - esi = 0x0000003f edi = 0x01d14d00 eax = 0x0000003f ecx = 0x63e70c80
05:08:22 INFO - edx = 0x0000001b efl = 0x00000246
05:08:22 INFO - Found by: given as instruction pointer in context
05:08:22 INFO - 1 nss3.dll!_PR_NativeRunThread [pruthr.c:035a951fc24a : 397 + 0x7]
05:08:22 INFO - eip = 0x675c6718 esp = 0x1735ff24 ebp = 0x1735ff38
05:08:22 INFO - Found by: call frame info
05:08:22 INFO - 2 nss3.dll!pr_root [w95thred.c:035a951fc24a : 90 + 0xb]
05:08:22 INFO - eip = 0x675b921f esp = 0x1735ff40 ebp = 0x1735ff44
05:08:22 INFO - Found by: call frame info
05:08:22 INFO - 3 msvcr120.dll + 0x2c01c
05:08:22 INFO - eip = 0x6764c01d esp = 0x1735ff4c ebp = 0x1735ff7c
05:08:22 INFO - Found by: call frame info
05:08:22 INFO - 4 msvcr120.dll + 0x2c000
05:08:22 INFO - eip = 0x6764c001 esp = 0x1735ff84 ebp = 0x1735ff88
05:08:22 INFO - Found by: previous frame's frame pointer
05:08:22 INFO - 5 kernel32.dll + 0x53c44
05:08:22 INFO - eip = 0x77453c45 esp = 0x1735ff90 ebp = 0x1735ff94
05:08:22 INFO - Found by: previous frame's frame pointer
05:08:22 INFO - 6 ntdll.dll + 0x637f4
05:08:22 INFO - eip = 0x779c37f5 esp = 0x1735ff9c ebp = 0x1735ffd4
05:08:22 INFO - Found by: previous frame's frame pointer
05:08:22 INFO - 7 ntdll.dll + 0x637c7
Comment hidden (Legacy TBPL/Treeherder Robot) |
Updated•10 years ago
OS: Windows 7 → All
Hardware: x86 → All
Summary: Intermittent browser_alarms.js | application crashed [@ mozilla::`anonymous namespace'::RunWatchdog(void *)] → Intermittent browser_alarms.js,test_webgl_request_mismatch.html | application crashed [@ mozilla::`anonymous namespace'::RunWatchdog(void *)]
Comment 3•10 years ago
Maybe the ASAN stack sheds some light on this?
Updated•10 years ago
Component: General → Networking: Cache
Comment hidden (Intermittent Failures Robot) |
Updated•9 years ago
Whiteboard: [necko-backlog]
Comment 66•8 years ago
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
Updated•8 years ago
Depends on: RunWatchdogShutdownhang
Comment 79•8 years ago
This has picked up: there were 50 failures in the last week, and the failure rate appears to have jumped around June 5th/6th. The failures are primarily on linux32/opt (non-e10s).
Taking this log file as an example:
https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=105826337&lineNumber=3065
I see the following:
[task 2017-06-09T13:41:43.056013Z] 13:41:43 INFO - TEST-START | browser/components/sessionstore/test/browser_windowStateContainer.js
[task 2017-06-09T13:41:52.232132Z] 13:41:52 INFO - GECKO(1703) | MEMORY STAT | vsize 1142MB | residentFast 377MB | heapAllocated 172MB
[task 2017-06-09T13:41:52.234193Z] 13:41:52 INFO - TEST-OK | browser/components/sessionstore/test/browser_windowStateContainer.js | took 9176ms
[task 2017-06-09T13:41:52.312716Z] 13:41:52 INFO - checking window state
[task 2017-06-09T13:41:59.565830Z] 13:41:59 INFO - GECKO(1703) | Completed ShutdownLeaks collections in process 1703
[task 2017-06-09T13:41:59.567569Z] 13:41:59 INFO - TEST-START | Shutdown
[task 2017-06-09T13:41:59.570925Z] 13:41:59 INFO - Browser Chrome Test Summary
[task 2017-06-09T13:41:59.572780Z] 13:41:59 INFO - Passed: 1164
[task 2017-06-09T13:41:59.574455Z] 13:41:59 INFO - Failed: 0
[task 2017-06-09T13:41:59.576153Z] 13:41:59 INFO - Todo: 0
[task 2017-06-09T13:41:59.577929Z] 13:41:59 INFO - Mode: non-e10s
[task 2017-06-09T13:41:59.583430Z] 13:41:59 INFO - *** End BrowserChrome Test Results ***
[task 2017-06-09T13:42:27.029268Z] 13:42:27 INFO - GECKO(1703) | [1703] WARNING: waitpid failed pid:1769 errno:10: file /home/worker/workspace/build/src/ipc/chromium/src/base/process_util_posix.cc, line 276
[task 2017-06-09T13:43:30.897461Z] 13:43:30 INFO - GECKO(1703) | ExceptionHandler::GenerateDump cloned child 1794
[task 2017-06-09T13:43:30.900046Z] 13:43:30 INFO - GECKO(1703) | ExceptionHandler::SendContinueSignalToChild sent continue signal to child
[task 2017-06-09T13:43:30.903394Z] 13:43:30 INFO - GECKO(1703) | ExceptionHandler::WaitForContinueSignal waiting for continue signal...
[task 2017-06-09T13:43:31.443716Z] 13:43:31 INFO - TEST-INFO | Main app process: exit 11
[task 2017-06-09T13:43:31.445760Z] 13:43:31 INFO - Buffered messages finished
[task 2017-06-09T13:43:31.447364Z] 13:43:31 ERROR - TEST-UNEXPECTED-FAIL | browser/components/sessionstore/test/browser_windowStateContainer.js | application terminated with exit code 11
[task 2017-06-09T13:43:31.450202Z] 13:43:31 INFO - runtests.py | Application ran for: 0:07:58.257090
[task 2017-06-09T13:43:31.454019Z] 13:43:31 INFO - zombiecheck | Reading PID log: /tmp/tmpvsQN3opidlog
[task 2017-06-09T13:43:31.455943Z] 13:43:31 INFO - ==> process 1703 launched child process 1722
[task 2017-06-09T13:43:31.458179Z] 13:43:31 INFO - ==> process 1703 launched child process 1769
[task 2017-06-09T13:43:31.461059Z] 13:43:31 INFO - zombiecheck | Checking for orphan process with PID: 1722
[task 2017-06-09T13:43:31.463053Z] 13:43:31 INFO - zombiecheck | Checking for orphan process with PID: 1769
[task 2017-06-09T13:43:31.465148Z] 13:43:31 INFO - mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/eD7DKr2pSzesOXbz2HeMXA/artifacts/public/build/target.crashreporter-symbols.zip
[task 2017-06-09T13:43:47.916554Z] 13:43:47 INFO - mozcrash Copy/paste: /usr/local/bin/linux64-minidump_stackwalk /tmp/tmpD_xLTd.mozrunner/minidumps/133ef873-6fa0-403d-463f-f713cdff9012.dmp /tmp/tmpaSKT7V
[task 2017-06-09T13:44:09.925072Z] 13:44:09 INFO - mozcrash Saved minidump as /home/worker/workspace/build/blobber_upload_dir/133ef873-6fa0-403d-463f-f713cdff9012.dmp
[task 2017-06-09T13:44:09.942381Z] 13:44:09 INFO - mozcrash Saved app info as /home/worker/workspace/build/blobber_upload_dir/133ef873-6fa0-403d-463f-f713cdff9012.extra
[task 2017-06-09T13:44:10.051103Z] 13:44:10 INFO - PROCESS-CRASH | browser/components/sessionstore/test/browser_windowStateContainer.js | application crashed [@ RunWatchdog]
[task 2017-06-09T13:44:10.053834Z] 13:44:10 INFO - Crash dump filename: /tmp/tmpD_xLTd.mozrunner/minidumps/133ef873-6fa0-403d-463f-f713cdff9012.dmp
[task 2017-06-09T13:44:10.058048Z] 13:44:10 INFO - Operating system: Linux
[task 2017-06-09T13:44:10.059866Z] 13:44:10 INFO - 0.0.0 Linux 3.13.0-100-generic #147-Ubuntu SMP Tue Oct 18 16:48:51 UTC 2016 x86_64
[task 2017-06-09T13:44:10.061438Z] 13:44:10 INFO - CPU: x86
[task 2017-06-09T13:44:10.063053Z] 13:44:10 INFO - GenuineIntel family 6 model 45 stepping 7
[task 2017-06-09T13:44:10.064814Z] 13:44:10 INFO - 1 CPU
[task 2017-06-09T13:44:10.066397Z] 13:44:10 INFO -
[task 2017-06-09T13:44:10.067941Z] 13:44:10 INFO - GPU: UNKNOWN
[task 2017-06-09T13:44:10.069592Z] 13:44:10 INFO -
[task 2017-06-09T13:44:10.072581Z] 13:44:10 INFO - Crash reason: SIGSEGV
[task 2017-06-09T13:44:10.074162Z] 13:44:10 INFO - Crash address: 0x0
[task 2017-06-09T13:44:10.075716Z] 13:44:10 INFO - Process uptime: not available
[task 2017-06-09T13:44:10.078003Z] 13:44:10 INFO -
[task 2017-06-09T13:44:10.079623Z] 13:44:10 INFO - Thread 14 (crashed)
[task 2017-06-09T13:44:10.082188Z] 13:44:10 INFO - 0 libxul.so!RunWatchdog [nsTerminator.cpp:0a3b78002ba3 : 160 + 0x2]
[task 2017-06-09T13:44:10.083905Z] 13:44:10 INFO - eip = 0xf38654fd esp = 0xc30fe2f0 ebp = 0xc30fe318 ebx = 0xf5615000
[task 2017-06-09T13:44:10.086096Z] 13:44:10 INFO - esi = 0x0000003f edi = 0xbef2f574 eax = 0x00000000 ecx = 0x00000000
[task 2017-06-09T13:44:10.088787Z] 13:44:10 INFO - edx = 0xf47a8408 efl = 0x00010246
[task 2017-06-09T13:44:10.090974Z] 13:44:10 INFO - Found by: given as instruction pointer in context
[task 2017-06-09T13:44:10.093279Z] 13:44:10 INFO - 1 libnspr4.so!_pt_root [ptthread.c:0a3b78002ba3 : 216 + 0x9]
[task 2017-06-09T13:44:10.098195Z] 13:44:10 INFO - eip = 0xf73b4c44 esp = 0xc30fe320 ebp = 0xc30fe368 ebx = 0xf73c56c4
[task 2017-06-09T13:44:10.099791Z] 13:44:10 INFO - esi = 0xbf045d40 edi = 0x000006fc
[task 2017-06-09T13:44:10.101360Z] 13:44:10 INFO - Found by: call frame info
[task 2017-06-09T13:44:10.102954Z] 13:44:10 INFO - 2 libpthread-2.23.so + 0x6295
[task 2017-06-09T13:44:10.104620Z] 13:44:10 INFO - eip = 0xf777f295 esp = 0xc30fe370 ebp = 0xc30fe428 ebx = 0x00000000
[task 2017-06-09T13:44:10.106366Z] 13:44:10 INFO - esi = 0x00000000 edi = 0x003d0f00
[task 2017-06-09T13:44:10.107927Z] 13:44:10 INFO - Found by: call frame info
[task 2017-06-09T13:44:10.109510Z] 13:44:10 INFO - 3 libc-2.23.so + 0xe6eee
[task 2017-06-09T13:44:10.111173Z] 13:44:10 INFO - eip = 0xf74b1eee esp = 0xc30fe430 ebp = 0x00000000
[task 2017-06-09T13:44:10.114172Z] 13:44:10 INFO - Found by: previous frame's frame pointer
[task 2017-06-09T13:44:10.115711Z] 13:44:10 INFO -
[task 2017-06-09T13:44:10.117257Z] 13:44:10 INFO - Thread 0
[task 2017-06-09T13:44:10.119043Z] 13:44:10 INFO - 0 libxul.so!XPCRootSetElem::RemoveFromRootSet [XPCJSRuntime.cpp:0a3b78002ba3 : 3171 + 0x5]
[task 2017-06-09T13:44:10.122247Z] 13:44:10 INFO - eip = 0xf0c874eb esp = 0xffbed7d0 ebp = 0xffbed7e8 ebx = 0xf5615000
[task 2017-06-09T13:44:10.123895Z] 13:44:10 INFO - esi = 0xcee9a958 edi = 0xcee9a940 eax = 0xceee6058 ecx = 0xf71352cc
[task 2017-06-09T13:44:10.125572Z] 13:44:10 INFO - edx = 0xdf1fc898 efl = 0x00010286
[task 2017-06-09T13:44:10.127406Z] 13:44:10 INFO - Found by: given as instruction pointer in context
[task 2017-06-09T13:44:10.129738Z] 13:44:10 INFO - 1 libxul.so!nsXPCWrappedJS::Release [XPCWrappedJS.cpp:0a3b78002ba3 : 281 + 0xe]
[task 2017-06-09T13:44:10.133705Z] 13:44:10 INFO - eip = 0xf0ca93a5 esp = 0xffbed7f0 ebp = 0xffbed838 ebx = 0xf5615000
[task 2017-06-09T13:44:10.135440Z] 13:44:10 INFO - esi = 0xcee9a948 edi = 0xcee9a940
[task 2017-06-09T13:44:10.137009Z] 13:44:10 INFO - Found by: call frame info
[task 2017-06-09T13:44:10.138695Z] 13:44:10 INFO - 2 libxul.so!nsXPTCStubBase::Release [xptcall.cpp:0a3b78002ba3 : 37 + 0xe]
[task 2017-06-09T13:44:10.140335Z] 13:44:10 INFO - eip = 0xf031ba04 esp = 0xffbed840 ebp = 0xffbed858 ebx = 0xf5615000
[task 2017-06-09T13:44:10.141965Z] 13:44:10 INFO - esi = 0xcc8ec220 edi = 0xcc8ec218
[task 2017-06-09T13:44:10.143689Z] 13:44:10 INFO - Found by: call frame info
[task 2017-06-09T13:44:10.146701Z] 13:44:10 INFO - 3 libxul.so!nsTArray_Impl<ObserverRef, nsTArrayInfallibleAllocator>::RemoveElementsAt [nsCOMPtr.h:0a3b78002ba3 : 294 + 0x8]
[task 2017-06-09T13:44:10.148483Z] 13:44:10 INFO - eip = 0xf02c626f esp = 0xffbed860 ebp = 0xffbed888 ebx = 0xf5615000
[task 2017-06-09T13:44:10.150144Z] 13:44:10 INFO - esi = 0xcc8ec220 edi = 0xcc8ec218
[task 2017-06-09T13:44:10.152087Z] 13:44:10 INFO - Found by: call frame info
[task 2017-06-09T13:44:10.154216Z] 13:44:10 INFO - 4 libxul.so!nsTHashtable<nsObserverList>::s_ClearEntry [nsTArray.h:0a3b78002ba3 : 1738 + 0x12]
[task 2017-06-09T13:44:10.157955Z] 13:44:10 INFO - eip = 0xf02c658b esp = 0xffbed890 ebp = 0xffbed8b8 ebx = 0xf5615000
[task 2017-06-09T13:44:10.159553Z] 13:44:10 INFO - esi = 0xe36132b4 edi = 0xe36132bc
[task 2017-06-09T13:44:10.161115Z] 13:44:10 INFO - Found by: call frame info
So we are failing on shutdown?
:mcmanus, can you help find someone to look into this failure in the next 2 weeks?
Flags: needinfo?(mcmanus)
Whiteboard: [necko-backlog] → [necko-backlog][stockwell needswork]
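The PROCESS-CRASH lines in logs like the one quoted in comment 79 follow a fixed shape, so the failing test and the crash signature can be extracted mechanically when comparing many logs. The following is a minimal C++ sketch under that assumption. `CrashLine` and `ParseProcessCrash` are hypothetical names chosen for illustration; they are not part of mozcrash or any other Mozilla tool.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper type: the two fields we want out of a PROCESS-CRASH line.
struct CrashLine {
  std::string test;       // path of the test the harness blames
  std::string signature;  // text between "[@ " and the final "]"
};

// Returns true and fills *out if the line looks like
//   ... PROCESS-CRASH | <test> | application crashed [@ <signature>]
// The pattern is an assumption based on the log lines quoted in this bug.
bool ParseProcessCrash(const std::string& line, CrashLine* out) {
  static const std::regex kPattern(
      R"(PROCESS-CRASH \| (.*?) \| application crashed \[@ (.*)\])");
  std::smatch m;
  if (!std::regex_search(line, m, kPattern)) {
    return false;
  }
  out->test = m[1].str();
  out->signature = m[2].str();
  return true;
}
```

Grouping the extracted test paths by signature is one way to check whether the crashes really cluster around a single test directory, as suggested later in this bug.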
Comment 80•8 years ago
Hey Joel, I don't see why this was marked a cache bug 2 years ago. I took a look at the following logs:
https://public-artifacts.taskcluster.net/S95MYI02S8Sg1Dn2k7vOOg/0/public/logs/live_backing.log
https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=105826337&lineNumber=3065
https://public-artifacts.taskcluster.net/ABSg1wEAR0agSqyDcUgXKA/0/public/logs/live_backing.log
All of them are in the middle of shutdown. In all of them the networking thread has completed, and I don't see any cache thread either. I'll needinfo honza to confirm the networking cache is not involved.
I'm not sure where to reassign this either. None of them look deadlocked; they are usually just doing some JS GC cleanup at different stages.
Flags: needinfo?(mcmanus) → needinfo?(honzab.moz)
Comment 82•8 years ago
Confirming the shutdown crashes (the only UNEXPECTED-FAIL I was able to find in those 3 logs) are not related to http cache.
Flags: needinfo?(honzab.moz)
Comment 85•8 years ago
Mike, this always occurs at the end of sessionrestore. Could you help find someone on the sessionrestore team to look into why we crash so often on linux32/opt non-e10s browser-chrome tests for sessionrestore?
Component: Networking: Cache → Session Restore
Flags: needinfo?(mdeboer)
Product: Core → Firefox
Comment 86•8 years ago
Why are you sure that this is occurring due to sessionstore? What do you mean with 'end of sessionrestore'?
When I look at the brasstacks view, it looks like bug 1373116, but in disguise. Let's ask Yoshi again!
Flags: needinfo?(mdeboer) → needinfo?(allstars.chh)
Comment 88•8 years ago
(In reply to Mike de Boer [:mikedeboer] from comment #86)
> Why are you sure that this is occurring due to sessionstore? What do you
> mean with 'end of sessionrestore'?
> When I look at the brasstacks view, it looks like bug 1373116, but in
> disguise. Let's ask Yoshi again!
Most of the linux32 opt failures reported around June 14 are nsTerminator shutdown crashes which occur at the end of the browser/components/sessionstore/test tests. That is indeed just like bug 1373116 (I think the bug 1373116 failures were reported here until bug 1373116 was opened).
Beginning around June 19, we started seeing similar failures on Windows reported here, but those do not seem to be associated with browser/components/sessionstore/test, or any other particular test directory.
Comment 91•8 years ago
This appears to be fixed.
Flags: needinfo?(allstars.chh)
Whiteboard: [necko-backlog][stockwell needswork] → [necko-backlog][stockwell unknown]
Comment 95•7 years ago
This started failing on August 28th and has failed 21 times since then:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1108587
The failures are primarily on win7/debug and linux64/asan configurations.
Updated•7 years ago
Crash Signature: [@ RunWatchdog]
Comment 112•7 years ago
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P3 → P1
Comment 113•7 years ago
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P1 → P3
Comment 121•7 years ago
In the last 7 days there have been 33 failures.
Most of the failures occur on the Linux x64 platform. Some also occur on Windows 7 and windows10-64, and a very small number occur on other platforms such as OS X 10.10 and windows7-32-stylo-disabled.
The failures occur mostly on the asan build type. There are also a few failures on debug and a single one on opt.
Here is an example of a recent log: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=141231965&lineNumber=36298
And a relevant snippet of the log:
[task 2017-11-01T03:45:49.856Z] 03:45:49 INFO - GECKO(3102) | ASAN:DEADLYSIGNAL
[task 2017-11-01T03:45:49.858Z] 03:45:49 INFO - GECKO(3102) | =================================================================
[task 2017-11-01T03:45:49.859Z] 03:45:49 ERROR - GECKO(3102) | ==3102==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7efc30b950be bp 0x7efbd5c5adf0 sp 0x7efbd5c5ade0 T180)
[task 2017-11-01T03:45:49.859Z] 03:45:49 INFO - GECKO(3102) | ==3102==The signal is caused by a WRITE memory access.
[task 2017-11-01T03:45:49.859Z] 03:45:49 INFO - GECKO(3102) | ==3102==Hint: address points to the zero page.
[task 2017-11-01T03:45:50.243Z] 03:45:50 INFO - GECKO(3102) | #0 0x7efc30b950bd in mozilla::(anonymous namespace)::RunWatchdog(void*) /builds/worker/workspace/build/src/toolkit/components/terminator/nsTerminator.cpp:163:5
[task 2017-11-01T03:45:50.245Z] 03:45:50 INFO - GECKO(3102) | #1 0x7efc412d74d3 in _pt_root /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:216:5
[task 2017-11-01T03:45:50.253Z] 03:45:50 INFO - GECKO(3102) | #2 0x7efc4564e6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)
[task 2017-11-01T03:45:50.302Z] 03:45:50 INFO - GECKO(3102) | #3 0x7efc446d73dc in clone /build/glibc-bfm8X4/glibc-2.23/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:109
[task 2017-11-01T03:45:50.303Z] 03:45:50 INFO - GECKO(3102) | AddressSanitizer can not provide additional info.
[task 2017-11-01T03:45:50.304Z] 03:45:50 INFO - GECKO(3102) | SUMMARY: AddressSanitizer: SEGV /builds/worker/workspace/build/src/toolkit/components/terminator/nsTerminator.cpp:163:5 in mozilla::(anonymous namespace)::RunWatchdog(void*)
:mikedeboer, we've noticed that the failures have increased over the last few days. Could you find someone who can look into this?
Flags: needinfo?(mdeboer)
Whiteboard: [necko-backlog][stockwell unknown] → [necko-backlog][stockwell needswork]
Comment 123•7 years ago
Hi Tiberius, sorry I can't help you with this specific bug. In fact, Joel triaged this to be part of Session Restore, but that doesn't seem to be true at all... right?
I'm wary of being the one to ping-pong stuff around, but I can't tell anything from the logs and I don't have the necessary knowledge to gather enough useful data when I look at how low level the failure actually is.
Joel, wouldn't it be more reasonable to find a platform engineer who may be able to do a preliminary analysis from the bottom up?
Flags: needinfo?(mdeboer) → needinfo?(jmaher)
Comment 124•7 years ago
Given that we seem to be failing in toolkit/components/terminator/nsTerminator.cpp, I'm moving this to Toolkit :: Async Tooling; ideally we can get more info there.
Component: Session Restore → Async Tooling
Flags: needinfo?(jmaher)
Product: Firefox → Toolkit
Comment 125•7 years ago
:yoric, could you help figure out the RunWatchdog crash we are seeing in comment 121?
Flags: needinfo?(dteller)
Assignee
Comment 126•7 years ago
Well, as usual with nsTerminator, this means that shutdown has timed out and we killed Firefox. Normally, nsTerminator should provide a crash report with slightly more details (i.e. which shutdown phase lasted more than 63 seconds), but this doesn't seem to show up in the log. I suppose we could patch the nsTerminator to dump this info to stderr on DEBUG builds, but I don't think that this would help much.
I'll be happy to answer any question, but I suspect that I won't be able to help much.
Flags: needinfo?(dteller)
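The mechanism described in comment 126 can be modelled as a counter that a watchdog thread increments once per second and that the main thread resets whenever shutdown moves to a new phase. The sketch below is an illustrative model only, not the code in nsTerminator.cpp: the class `ShutdownWatchdog`, its methods, and the stderr message are hypothetical, and the phase strings are just labels. It also shows the idea of reporting the stalled phase on stderr before killing the process.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <thread>

// Hypothetical model of a shutdown watchdog (illustrative names only).
class ShutdownWatchdog {
 public:
  explicit ShutdownWatchdog(uint32_t crashAfterTicks)
      : mCrashAfterTicks(crashAfterTicks) {}

  // Main thread: called each time shutdown enters a new phase.
  // Resets the tick counter so every phase gets a fresh budget.
  void OnPhaseStart(const char* phase) {
    mPhase.store(phase);
    mTicks.store(0);
  }

  // Watchdog thread: called once per tick. Returns true once the current
  // phase has consumed its whole budget without advancing.
  bool Tick() { return mTicks.fetch_add(1) + 1 >= mCrashAfterTicks; }

  // Name of the phase that was running when the budget ran out.
  const char* StalledPhase() const { return mPhase.load(); }

 private:
  const uint32_t mCrashAfterTicks;
  std::atomic<uint32_t> mTicks{0};
  std::atomic<const char*> mPhase{"(none)"};
};

// Watchdog thread body: sleep one tick, check the budget, and abort the
// process with a diagnostic on stderr if a phase has stalled.
void RunWatchdog(ShutdownWatchdog& dog, std::chrono::milliseconds tick) {
  for (;;) {
    std::this_thread::sleep_for(tick);
    if (dog.Tick()) {
      std::fprintf(stderr, "Shutdown stalled in phase: %s\n",
                   dog.StalledPhase());
      std::abort();
    }
  }
}
```

With a budget of 63 ticks and a one-second tick, this model kills the process once a single phase has run for 63 seconds, which corresponds to the per-phase limit mentioned in comment 126. A slow build that legitimately needs longer per phase would trip the same check, which is why the crash alone does not say whether shutdown was hung or merely slow.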
Comment 127•7 years ago
As this is heavily weighted toward linux64-asan, do you think we could adjust the timing for slow configurations? Right now we are at a dead end, so any additional information we could get would be helpful!
Assignee
Comment 128•7 years ago
I suppose we could use a different constant in ASAN builds, if one single step of shutdown takes more than 1 wallclock minute. Do we have a build flag for ASAN?
Assignee
Comment 129•7 years ago
Ok, MOZ_ASAN. I'll try and build a patch to make nsTerminator more ASAN-friendly. I suspect that this will just change the crashes into harness timeouts, but we can try.
Comment 130•7 years ago
I think we can use MOZ_ASAN || MOZ_DEBUG; I see these referenced here:
http://searchfox.org/mozilla-central/source/js/src/jsutil.h#334
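The idea from comments 128 to 130, using a different constant on slow builds, could look roughly like the sketch below. It is an assumption-laden illustration rather than the patch that landed: the 63-second base comes from comment 126, but the multiplier of 3 and the names `kBaseCrashAfterSeconds`, `kSlowBuildFactor` and `CrashAfterSeconds` are hypothetical. This bug does not state how much extra time the landed patch grants.

```cpp
#include <cassert>
#include <cstdint>

// Per-phase shutdown budget on regular builds, in seconds
// (the 63 s figure is the one quoted in comment 126).
constexpr uint32_t kBaseCrashAfterSeconds = 63;

// HYPOTHETICAL multiplier for slow configurations; the real value used by
// the patch is not given in this bug.
#if defined(MOZ_ASAN) || defined(MOZ_DEBUG)
constexpr uint32_t kSlowBuildFactor = 3;
#else
constexpr uint32_t kSlowBuildFactor = 1;
#endif

// Budget the watchdog should use for the current build configuration.
constexpr uint32_t CrashAfterSeconds() {
  return kBaseCrashAfterSeconds * kSlowBuildFactor;
}

// The budget is never shorter than on a regular build.
static_assert(CrashAfterSeconds() >= kBaseCrashAfterSeconds,
              "slow builds must not get a shorter budget");
```

Selecting the factor at compile time keeps regular builds unchanged. As noted in comment 129, a longer budget may simply turn some of these crashes into harness timeouts if shutdown is truly stuck.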
Comment hidden (mozreview-request) |
Comment 132•7 years ago
mozreview-review |
Comment on attachment 8926895 [details]
Bug 1108587 - Extending the grace period of AsyncShutdown and the nsTerminator for ASAN builds;
https://reviewboard.mozilla.org/r/198140/#review203386
LGTM, I've seen ASAN try runs turn orange so many times that anything that makes them greener is appreciated.
Attachment #8926895 -
Flags: review?(gsvelto) → review+
Comment 133•7 years ago
Pushed by dteller@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/56e1ecb8bb58
Extending the grace period of AsyncShutdown and the nsTerminator for ASAN builds;r=gsvelto
Comment 134•7 years ago
bugherder |
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox58:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Updated•7 years ago
Assignee: nobody → dteller
status-firefox56:
--- → wontfix
status-firefox57:
--- → wontfix
status-firefox-esr52:
--- → wontfix
Updated•7 years ago
Whiteboard: [necko-backlog][stockwell needswork] → [necko-backlog][stockwell fixed:other]
Updated•7 years ago
Blocks: RunWatchdogShutdownhang
No longer depends on: RunWatchdogShutdownhang