Closed
Bug 1141107
Opened 10 years ago
Closed 10 years ago
Only slaves broken by our inability to update talos hit tsvgx,tresize,tp5o_scroll | application crashed [@ nsSocketTransport::InitiateSocket()]
Categories
(Testing :: Talos, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: cbook, Unassigned)
Details
(Keywords: crash, intermittent-failure, Whiteboard: [leave open])
Ubuntu HW 12.04 fx-team pgo talos svgr
https://treeherder.mozilla.org/logviewer.html#?job_id=2194969&repo=fx-team
07:51:44 INFO - PROCESS-CRASH | tsvgx | application crashed [@ nsSocketTransport::InitiateSocket()]
07:51:44 INFO - Crash dump filename: /tmp/tmpjNnTOS/profile/minidumps/25db96e2-9a6a-5c8d-7f85514e-2fe7f7ee.dmp
07:51:44 INFO - Operating system: Linux
07:51:44 INFO - 0.0.0 Linux 3.2.0-76-generic-pae #111-Ubuntu SMP Tue Jan 13 22:34:29 UTC 2015 i686
07:51:44 INFO - CPU: x86
07:51:44 INFO - GenuineIntel family 6 model 30 stepping 5
07:51:44 INFO - 8 CPUs
07:51:44 INFO - Crash reason: SIGSEGV
07:51:44 INFO - Crash address: 0x0
07:51:44 INFO - Thread 5 (crashed)
07:51:44 INFO - 0 libxul.so!nsSocketTransport::InitiateSocket() [nsSocketTransport2.cpp:a9e7d74b9f5b : 1226 + 0x0]
07:51:44 INFO - eip = 0xb35eebf1 esp = 0xae4fddb0 ebp = 0xae4fdf58 ebx = 0xb6f06140
07:51:44 INFO - esi = 0xb7645d9c edi = 0xb559fb5c eax = 0x00000172 ecx = 0x00000000
07:51:44 INFO - edx = 0xae4fd770 efl = 0x00210286
07:51:44 INFO - Found by: given as instruction pointer in context
07:51:44 INFO - 1 libxul.so!nsSocketTransport::OnSocketEvent(unsigned int, nsresult, nsISupports*) [nsSocketTransport2.cpp:a9e7d74b9f5b : 1765 + 0x7]
07:51:44 INFO - eip = 0xb35eeeac esp = 0xae4fdf60 ebp = 0xae4fdfa8 ebx = 0xb6f06140
07:51:44 INFO - esi = 0x9dace900 edi = 0xb3610e40
07:51:44 INFO - Found by: call frame info
07:51:44 INFO - 2 libxul.so!nsSocketEvent::Run() [nsSocketTransport2.cpp:a9e7d74b9f5b : 79 + 0x1f]
07:51:44 INFO - eip = 0xb35ef149 esp = 0xae4fdfb0 ebp = 0xae4fdfc8 ebx = 0xb6f06140
07:51:44 INFO - esi = 0xb722c6c0 edi = 0xb7237a88
07:51:44 INFO - Found by: call frame info
07:51:44 INFO - 3 libxul.so!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:a9e7d74b9f5b : 855 + 0x1]
07:51:44 INFO - eip = 0xb323ccc8 esp = 0xae4fdfd0 ebp = 0xae4fe058 ebx = 0xb6f06140
07:51:44 INFO - esi = 0xb722c6c0 edi = 0xb7237a88
07:51:44 INFO - Found by: call frame info
07:51:44 INFO - 4 libxul.so!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:a9e7d74b9f5b : 265 + 0x12]
07:51:44 INFO - eip = 0xb3245a25 esp = 0xae4fe060 ebp = 0xae4fe098 ebx = 0xb6f06140
07:51:44 INFO - esi = 0xae4fe08c edi = 0xb7237a80
07:51:44 INFO - Found by: call frame info
07:51:44 INFO - 5 libxul.so!nsSocketTransportService::Run() [nsSocketTransportService2.cpp:a9e7d74b9f5b : 768 + 0xf]
07:51:44 INFO - eip = 0xb35feb0c esp = 0xae4fe0a0 ebp = 0xae4fe108 ebx = 0xb6f06140
Reporter
Updated•10 years ago
OS: Mac OS X → Linux
| Comment hidden (Legacy TBPL/Treeherder Robot) |
The actual failure is in reaching self-repair.mozilla.org:
07:51:24 INFO - mozversion INFO | application_repository: https://hg.mozilla.org/integration/fx-team
07:51:24 INFO - mozversion INFO | application_version: 39.0a1
07:51:24 INFO - mozversion INFO | platform_buildid: 20150309053012
07:51:24 INFO - mozversion INFO | platform_changeset: a9e7d74b9f5b
07:51:24 INFO - mozversion INFO | platform_repository: https://hg.mozilla.org/integration/fx-team
07:51:24 INFO - DEBUG : initialized firefox
07:51:24 INFO - DEBUG : command line: /builds/slave/talos-slave/test-pgo/build/application/firefox/firefox -profile /tmp/tmpjNnTOS/profile -tp file:/home/cltbld/talos-slave/test/build/venv/lib/python2.7/site-packages/talos/page_load_test/svgx/svgx.manifest -tpchrome -tpnoisy -tpcycles 1 -tppagecycles 25
07:51:31 INFO - INFO : Browser exited with error code: 11
07:51:36 INFO - INFO : FATAL ERROR: Non-local network connections are disabled and a connection attempt to self-repair.mozilla.org (54.192.119.182) was made.
07:51:36 INFO - You should only access hostnames available via the test networking proxy (if running mochitests) or from a test-specific httpd.js server (if running xpcshell tests). Browser services should be disabled or redirected to a local server.
07:51:36 INFO - DEBUG : Terminating: firefox, plugin-container, crashreporter
07:51:36 INFO - mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/fx-team-linux-pgo/1425904212/firefox-39.0a1.en-US.linux-i686.crashreporter-symbols.zip
07:51:44 INFO - mozcrash INFO | Saved minidump as /builds/slave/talos-slave/test-pgo/build/blobber_upload_dir/25db96e2-9a6a-5c8d-7f85514e-2fe7f7ee.dmp
07:51:44 INFO - mozcrash INFO | Saved app info as /builds/slave/talos-slave/test-pgo/build/blobber_upload_dir/25db96e2-9a6a-5c8d-7f85514e-2fe7f7ee.extra
07:51:44 INFO - __metrics Screen width/height:1600/1200
07:51:44 INFO - colorDepth:24
07:51:44 INFO - Browser inner width/height: 1024/697
07:51:44 INFO - __metrics
07:51:44 INFO - JavaScript error: resource:///modules/WebappManager.jsm, line 48: NS_ERROR_FAILURE: Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]
07:51:44 INFO - *************************
07:51:44 INFO - A coding exception was thrown and uncaught in a Task.
07:51:44 INFO - Full message: ReferenceError: ProfileTimesAccessor is not defined
07:51:44 INFO - Full stack: this.TelemetryEnvironment._getProfile<@resource://gre/modules/TelemetryEnvironment.jsm:446:9
07:51:44 INFO - TaskImpl_run@resource://gre/modules/Task.jsm:314:40
07:51:44 INFO - TaskImpl@resource://gre/modules/Task.jsm:275:3
07:51:44 INFO - createAsyncFunction/asyncFunction@resource://gre/modules/Task.jsm:249:14
07:51:44 INFO - this.TelemetryEnvironment._doGetEnvironmentData</sections.profile@resource://gre/modules/TelemetryEnvironment.jsm:918:24
07:51:44 INFO - this.TelemetryEnvironment._doGetEnvironmentData<@resource://gre/modules/TelemetryEnvironment.jsm:931:25
07:51:44 INFO - TaskImpl_run@resource://gre/modules/Task.jsm:314:40
07:51:44 INFO - TaskImpl_handleResultValue@resource://gre/modules/Task.jsm:393:7
07:51:44 INFO - TaskImpl_run@resource://gre/modules/Task.jsm:322:13
07:51:44 INFO - TaskImpl_handleResultValue@resource://gre/modules/Task.jsm:393:7
07:51:44 INFO - TaskImpl_run@resource://gre/modules/Task.jsm:322:13
07:51:44 INFO - TaskImpl@resource://gre/modules/Task.jsm:275:3
07:51:44 INFO - createAsyncFunction/asyncFunction@resource://gre/modules/Task.jsm:249:14
07:51:44 INFO - this.TelemetryEnvironment.getEnvironmentData@resource://gre/modules/TelemetryEnvironment.jsm:904:25
07:51:44 INFO - assemblePing@resource://gre/modules/TelemetryPing.jsm:348:14
07:51:44 INFO - savePendingPings@resource://gre/modules/TelemetryPing.jsm:443:12
07:51:44 INFO - this.TelemetryPing<.savePendingPings@resource://gre/modules/TelemetryPing.jsm:191:12
07:51:44 INFO - savePendingPingsClassic@resource://gre/modules/TelemetrySession.jsm:1302:12
07:51:44 INFO - savePendingPings/<@resource://gre/modules/TelemetrySession.jsm:1286:37
07:51:44 INFO - Handler.prototype.process@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:867:23
07:51:44 INFO - this.PromiseWalker.walkerLoop@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:746:7
07:51:44 INFO - this.PromiseWalker.scheduleWalkerLoop/<@resource://gre/modules/Promise.jsm -> resource://gre/modules/Promise-backend.js:688:37
07:51:44 INFO - Spinner.prototype.observe@resource://gre/modules/AsyncShutdown.jsm:464:9
07:51:44 INFO - *************************
07:51:44 ERROR - JavaScript error: resource:///modules/CustomizableUI.jsm, line 1568: TypeError: aWindowPalette is undefined
07:51:44 INFO - FATAL ERROR: Non-local network connections are disabled and a connection attempt to self-repair.mozilla.org (54.192.119.182) was made.
07:51:44 INFO - You should only access hostnames available via the test networking proxy (if running mochitests) or from a test-specific httpd.js server (if running xpcshell tests). Browser services should be disabled or redirected to a local server.
07:51:44 INFO - PROCESS-CRASH | tsvgx | application crashed [@ nsSocketTransport::InitiateSocket()]
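The log above pairs each PROCESS-CRASH line with a "Non-local network connections are disabled" fatal error, which is what points at self-repair.mozilla.org rather than at the socket code itself. A minimal sketch of pulling both out of a Talos log follows. The regular expressions are written against the lines quoted in this bug only, and the names `summarize`, `CRASH_RE` and `NONLOCAL_RE` are hypothetical helpers, not part of Talos, mozcrash or Treeherder.

```python
import re

# Patterns are derived from the log lines quoted in this bug; other
# harness versions may word these messages differently (assumption).
NONLOCAL_RE = re.compile(
    r"FATAL ERROR: Non-local network connections are disabled and a "
    r"connection attempt to (?P<host>\S+) \((?P<ip>[0-9.]+)\) was made"
)
CRASH_RE = re.compile(
    r"PROCESS-CRASH \| (?P<test>\S+) \| "
    r"application crashed \[@ (?P<signature>.+?)\]"
)


def summarize(log_text):
    """Return (crashes, hosts) found in a Talos log.

    crashes: list of (test name, crash signature) in log order.
    hosts:   set of (hostname, ip) the browser tried to reach.
    """
    crashes = []
    hosts = set()
    for line in log_text.splitlines():
        crash = CRASH_RE.search(line)
        if crash:
            crashes.append((crash.group("test"), crash.group("signature")))
        nonlocal_hit = NONLOCAL_RE.search(line)
        if nonlocal_hit:
            hosts.add((nonlocal_hit.group("host"), nonlocal_hit.group("ip")))
    return crashes, hosts


# Two lines copied from the log in this bug.
sample = (
    "07:51:36 INFO - INFO : FATAL ERROR: Non-local network connections are "
    "disabled and a connection attempt to self-repair.mozilla.org "
    "(54.192.119.182) was made.\n"
    "07:51:44 INFO - PROCESS-CRASH | tsvgx | application crashed "
    "[@ nsSocketTransport::InitiateSocket()]\n"
)
crashes, hosts = summarize(sample)
print(crashes)
print(hosts)
```

If `hosts` is non-empty for a run that also reports a crash in nsSocketTransport::InitiateSocket(), the crash is most likely the deliberate abort on a non-local connection attempt, not an independent networking bug.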
Flags: needinfo?(glind)
Summary: Intermittent tsvgx | application crashed [@ nsSocketTransport::InitiateSocket()] → Intermittent tsvgx,tresize | application crashed [@ nsSocketTransport::InitiateSocket()]
Summary: Intermittent tsvgx,tresize | application crashed [@ nsSocketTransport::InitiateSocket()] → Intermittent tsvgx,tresize,1d8fe559384e,tp5o_scroll | application crashed [@ nsSocketTransport::InitiateSocket()]
Comment 13•10 years ago
Not sure what info we need from glind (we don't seem to have actually asked him any question), but the failure is "caused" by bug 1138323, fixed by bug 1138823, and deployed by bug 1139328. However, because we're both unable to actually update talos successfully and apparently unwilling to take that inability sufficiently seriously, I'm going to have to disable every slave which hits this. The last time we did this, or perhaps the time before last, we then did without the services of those slaves for about six months.
Let's try needinfo with an actual question: jmaher, where's that bug about making talos updates actually work, and what is blocking it from happening?
Component: General → Talos
Flags: needinfo?(glind) → needinfo?(jmaher)
Product: Core → Testing
Summary: Intermittent tsvgx,tresize,1d8fe559384e,tp5o_scroll | application crashed [@ nsSocketTransport::InitiateSocket()] → Only slaves broken by our inability to update talos hit tsvgx,tresize,tp5o_scroll | application crashed [@ nsSocketTransport::InitiateSocket()]
Comment 14•10 years ago
Turns out that's not the question to ask either, since I'm disabling the exact same set of slaves we disabled last August for failing to update talos, and only finally reimaged in February. So apparently I want to ask someone somewhere what's broken in our Linux talos slave image that causes it to create slaves which fail to update talos.
Flags: needinfo?(jmaher)
Comment 24•10 years ago
The talos update thing is related to mozharness and Python packaging; this looks to be something quite different.
(I am here. Adding Alessio (:dexter), who actually did the Firefox side of this. https://s-r.m.o *should be live and reachable*. If something in the tests is wrong, happy to work on it!
Forgive our ignorance on all aspects of pref shimming for tests!)
Comment 29•10 years ago
[Mass Closure] Closing bug as WORKSFORME, as the intermittent failure has not been seen for 45+ days. If this has been closed and you feel that it should not have been closed, please reopen and add [leave open] to the whiteboard.
Updated•10 years ago
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
Updated•10 years ago
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Whiteboard: [leave open]
Updated•10 years ago
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED