Closed
Bug 1299814
Opened 9 years ago
Closed 8 years ago
Intermittent [taskcluster:error] Task was aborted because states could not be created successfully. Error calling 'link' for taskclusterProxy : Failed to initialize taskcluster proxy service.
Categories
(Taskcluster :: General, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: intermittent-bug-filer, Unassigned)
References
Details
(Keywords: intermittent-failure, Whiteboard: [docker-link-failure][stockwell infra][fennec-scouting])
Comment 1•9 years ago
I'm guessing this shares a common cause or remediation with bug 1285090, which has a lot more stars.
Updated•9 years ago
Whiteboard: [docker-error-pulling]
Updated•9 years ago
Whiteboard: [docker-error-pulling] → [docker-link-failure]
Comment 6•9 years ago
:dustin, this error spiked recently. Can you take a stab at fixing it?
Flags: needinfo?(dustin)
Comment 8•9 years ago
The large majority of the failures I've seen over the last couple of days appear to be due to a 503 error returned from Heroku, which looks like a request timeout. I've asked Jonas to look into the root cause. We do use a client which retries before failing completely.
Comment 14•9 years ago
Greg, as a note, this spiked yesterday. I'm not sure whether this is a one-time spike or a sign of a larger issue.
Comment 20•9 years ago
I noticed a server-side taskcluster-proxy crash, which caused the failure of https://tools.taskcluster.net/task-inspector/#Q1DigDtfSc-vq55qXNPSNA/0:
From https://papertrailapp.com/systems/686520052/events?highlight=765432606006349830&focus=765432606006349830
unexpected fault address 0x0
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x45bc79]

goroutine 11 [running]:
runtime.throw(0x6c8b59, 0x5)
	/usr/local/go/src/runtime/panic.go:566 +0x95 fp=0xc42018cc98 sp=0xc42018cc78
runtime.sigpanic()
	/usr/local/go/src/runtime/sigpanic_unix.go:27 +0x288 fp=0xc42018ccf0 sp=0xc42018cc98
runtime.memmove(0xc4201f9fe0, 0x0, 0x1e0d8ce2189f0a78)
	/usr/local/go/src/runtime/memmove_amd64.s:129 +0x1e9 fp=0xc42018ccf8 sp=0xc42018ccf0
reflect.typedslicecopy(0x667840, 0xc4201f9fe0, 0x4, 0x4, 0x0, 0xa3c1b19c4313e14f, 0x57f6bc6680176e91, 0x412935)
	/usr/local/go/src/runtime/mbarrier.go:303 +0x6a fp=0xc42018cd50 sp=0xc42018ccf8
reflect.Copy(0x680260, 0xc420080630, 0x197, 0x663a80, 0xc420202000, 0x97, 0x97)
	/usr/local/go/src/reflect/value.go:1873 +0x1e0 fp=0xc42018cdf0 sp=0xc42018cd50
encoding/asn1.parseField(0x680260, 0xc420080630, 0x197, 0xc42016737e, 0x1d, 0x153, 0x0, 0x0, 0x0, 0x0, ...)
	/usr/local/go/src/encoding/asn1/asn1.go:775 +0x1f6b fp=0xc42018d4d0 sp=0xc42018cdf0
encoding/asn1.parseField(0x6987c0, 0xc420080630, 0x199, 0xc42016737c, 0x40, 0x155, 0x0, 0x0, 0x0, 0x0, ...)
	/usr/local/go/src/encoding/asn1/asn1.go:856 +0x6c7 fp=0xc42018dbb0 sp=0xc42018d4d0
encoding/asn1.parseSequenceOf(0xc42016737c, 0x40, 0x155, 0x7edec0, 0x663700, 0x7edec0, 0x6987c0, 0x1, 0x1f4, 0x0, ...)
	/usr/local/go/src/encoding/asn1/asn1.go:563 +0x485 fp=0xc42018dcf0 sp=0xc42018dbb0
encoding/asn1.parseField(0x663700, 0xc4201fe538, 0x197, 0xc420167188, 0x234, 0x349, 0x1f0, 0x101, 0x0, 0xc4201fb2b8, ...)
	/usr/local/go/src/encoding/asn1/asn1.go:872 +0x1161 fp=0xc42018e3d0 sp=0xc42018dcf0
encoding/asn1.parseField(0x6b8580, 0xc4201fe318, 0x199, 0xc420167184, 0x34c, 0x34d, 0x0, 0x0, 0x0, 0x0, ...)
	/usr/local/go/src/encoding/asn1/asn1.go:856 +0x6c7 fp=0xc42018eab0 sp=0xc42018e3d0
encoding/asn1.parseField(0x6a2240, 0xc4201fe300, 0x199, 0xc420167180, 0x350, 0x351, 0x0, 0x0, 0x0, 0x0, ...)
	/usr/local/go/src/encoding/asn1/asn1.go:856 +0x6c7 fp=0xc42018f190 sp=0xc42018eab0
encoding/asn1.UnmarshalWithParams(0xc420167180, 0x350, 0x351, 0x65ab40, 0xc4201fe300, 0x0, 0x0, 0x411e68, 0x300, 0x6a2240, ...)
	/usr/local/go/src/encoding/asn1/asn1.go:995 +0x14f fp=0xc42018f268 sp=0xc42018f190
encoding/asn1.Unmarshal(0xc420167180, 0x350, 0x351, 0x65ab40, 0xc4201fe300, 0xc42019bb21, 0xc4201eb0b0, 0xc4201eb080, 0xc42019bb21, 0x1)
	/usr/local/go/src/encoding/asn1/asn1.go:988 +0x72 fp=0xc42018f2d8 sp=0xc42018f268
crypto/x509.ParseCertificate(0xc420167180, 0x350, 0x351, 0xb, 0xc42019bf01, 0x3efeb)
	/usr/local/go/src/crypto/x509/x509.go:1193 +0x95 fp=0xc42018f350 sp=0xc42018f2d8
crypto/x509.(*CertPool).AppendCertsFromPEM(0xc420165a40, 0xc42019bfb9, 0x3efeb, 0x3f1eb, 0x431a4)
	/usr/local/go/src/crypto/x509/cert_pool.go:108 +0x126 fp=0xc42018f3a8 sp=0xc42018f350
crypto/x509.loadSystemRoots(0xc42018f4c8, 0xc42018f4d0, 0x50ca8d)
	/usr/local/go/src/crypto/x509/root_unix.go:31 +0x22b fp=0xc42018f490 sp=0xc42018f3a8
crypto/x509.initSystemRoots()
	/usr/local/go/src/crypto/x509/root.go:21 +0x26 fp=0xc42018f4c8 sp=0xc42018f490
sync.(*Once).Do(0x821488, 0x6edc60)
	/usr/local/go/src/sync/once.go:44 +0xdb fp=0xc42018f500 sp=0xc42018f4c8
crypto/x509.systemRootsPool(0x0)
	/usr/local/go/src/crypto/x509/root.go:16 +0x39 fp=0xc42018f520 sp=0xc42018f500
crypto/x509.(*Certificate).Verify(0xc420075680, 0xc420119820, 0x15, 0xc4201659b0, 0x0, 0xed02b1870, 0x11a6452a, 0x804da0, 0x0, 0x0, ...)
	/usr/local/go/src/crypto/x509/verify.go:247 +0x666 fp=0xc42018f770 sp=0xc42018f520
crypto/tls.(*clientHandshakeState).doFullHandshake(0xc42018fe08, 0xc42016c3c0, 0x59)
	/usr/local/go/src/crypto/tls/handshake_client.go:300 +0x221f fp=0xc42018fbf0 sp=0xc42018f770
crypto/tls.(*Conn).clientHandshake(0xc420166a80, 0x6ee660, 0xc420166b88)
	/usr/local/go/src/crypto/tls/handshake_client.go:228 +0xfd1 fp=0xc42018fec0 sp=0xc42018fbf0
crypto/tls.(*Conn).Handshake(0xc420166a80, 0x0, 0x0)
	/usr/local/go/src/crypto/tls/conn.go:1260 +0x1b8 fp=0xc42018ff30 sp=0xc42018fec0
net/http.(*Transport).dialConn.func3(0xc420166a80, 0xc42000d780, 0xc42016c300)
	/usr/local/go/src/net/http/transport.go:1033 +0x2f fp=0xc42018ff78 sp=0xc42018ff30
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc42018ff80 sp=0xc42018ff78
created by net/http.(*Transport).dialConn
	/usr/local/go/src/net/http/transport.go:1038 +0xb4f

goroutine 1 [select]:
net/http.(*Transport).getConn(0xc42000a1e0, 0xc420119800, 0x0, 0xc42000d100, 0x5, 0xc420119820, 0x19, 0x0, 0x0, 0x8059c0)
	/usr/local/go/src/net/http/transport.go:890 +0x9cc
net/http.(*Transport).RoundTrip(0xc42000a1e0, 0xc42000aff0, 0xc42000a1e0, 0x0, 0xc400000000)
	/usr/local/go/src/net/http/transport.go:367 +0x307
net/http.send(0xc42000aff0, 0x7e5dc0, 0xc42000a1e0, 0x0, 0x0, 0x0, 0x8, 0xc4200334d8, 0xc420020538)
	/usr/local/go/src/net/http/client.go:256 +0x15f
net/http.(*Client).send(0xc420033770, 0xc42000aff0, 0x0, 0x0, 0x0, 0xc420020538, 0x0, 0x1)
	/usr/local/go/src/net/http/client.go:146 +0x102
net/http.(*Client).doFollowingRedirects(0xc420033770, 0xc42000aff0, 0x6ee190, 0x3, 0x1, 0x0)
	/usr/local/go/src/net/http/client.go:528 +0x5e5
net/http.(*Client).Do(0xc420033770, 0xc42000aff0, 0x0, 0x0, 0x10)
	/usr/local/go/src/net/http/client.go:184 +0x1ea
github.com/taskcluster/taskcluster-client-go.(*ConnectionData).Request.func1(0xc420033800, 0x45c033, 0x58992170, 0x9acfaa2, 0xc420033820)
	/home/jonasfj/Mozilla/go/src/github.com/taskcluster/taskcluster-client-go/http.go:87 +0x32f
github.com/taskcluster/httpbackoff.(*Client).Retry.func1(0xc420047800, 0x411e68)
	/home/jonasfj/Mozilla/go/src/github.com/taskcluster/httpbackoff/httpbackoff.go:86 +0x6c
github.com/cenkalti/backoff.RetryNotify(0xc420049ce0, 0x7e8180, 0xc420047800, 0x6ede30, 0xc41fffdc42, 0xc4200339d0)
	/home/jonasfj/Mozilla/go/src/github.com/cenkalti/backoff/retry.go:32 +0x3f
github.com/taskcluster/httpbackoff.(*Client).Retry(0x803830, 0xc4200477a0, 0x60, 0x6acb40, 0x1, 0xc4200477a0)
	/home/jonasfj/Mozilla/go/src/github.com/taskcluster/httpbackoff/httpbackoff.go:125 +0x21b
github.com/taskcluster/httpbackoff.Retry(0xc4200477a0, 0xc4200477a0, 0x0, 0x0, 0x0)
	/home/jonasfj/Mozilla/go/src/github.com/taskcluster/httpbackoff/httpbackoff.go:139 +0x37
github.com/taskcluster/taskcluster-client-go.(*ConnectionData).Request(0xc4201194a0, 0x821418, 0x0, 0x0, 0x6c8557, 0x3, 0xc4201194c0, 0x1c, 0x0, 0x0, ...)
	/home/jonasfj/Mozilla/go/src/github.com/taskcluster/taskcluster-client-go/http.go:93 +0x198
github.com/taskcluster/taskcluster-client-go.(*ConnectionData).APICall(0xc4201194a0, 0x0, 0x0, 0x6c8557, 0x3, 0xc4201194c0, 0x1c, 0x65b7c0, 0xc4200b4300, 0x0, ...)
	/home/jonasfj/Mozilla/go/src/github.com/taskcluster/taskcluster-client-go/http.go:139 +0x129
github.com/taskcluster/taskcluster-client-go/queue.(*Queue).Task(0xc420033e70, 0x7fff4b3e0df5, 0x16, 0xc420033de8, 0x1, 0xc420048f01)
	/home/jonasfj/Mozilla/go/src/github.com/taskcluster/taskcluster-client-go/queue/queue.go:89 +0x180
main.main()
	/home/jonasfj/Mozilla/go/src/github.com/taskcluster/taskcluster-proxy/main.go:92 +0x954

goroutine 5 [chan receive]:
net/http.(*Transport).dialConn(0xc42000a1e0, 0x7ea100, 0xc4200123c8, 0x0, 0xc42000d100, 0x5, 0xc420119820, 0x19, 0x0, 0x0, ...)
	/usr/local/go/src/net/http/transport.go:1039 +0xb91
net/http.(*Transport).getConn.func4(0xc42000a1e0, 0x7ea100, 0xc4200123c8, 0xc420164330, 0xc4200479e0)
	/usr/local/go/src/net/http/transport.go:885 +0x78
created by net/http.(*Transport).getConn
	/usr/local/go/src/net/http/transport.go:887 +0x3a1
Comment 21•9 years ago
This spiked sharply starting last night and continuing this morning. Can we document what happened here?
Flags: needinfo?(garndt)
Comment 23•9 years ago
Spot-checking about a dozen of these, the failures are largely concentrated around retrieving the task definition when using the taskclusterProxy. Yesterday we were having a lot of timeouts within taskcluster-queue, which have since been addressed. These failures align with the period during which we were having those timeouts.
While investigating this, we did identify two things that should be addressed (but were not the root cause of the timeouts):
https://bugzilla.mozilla.org/show_bug.cgi?id=1338611
https://bugzilla.mozilla.org/show_bug.cgi?id=1338630
Flags: needinfo?(garndt)
Updated•9 years ago
Whiteboard: [docker-link-failure] → [docker-link-failure][stockwell infra]
Comment 32•8 years ago
There was a pickup in aurora failures, but not enough for me to investigate further.
Updated•8 years ago
Whiteboard: [docker-link-failure][stockwell infra] → [docker-link-failure][stockwell infra][fennec-scouting]
Comment 49•8 years ago
Going to close this out. The last failure was last month, and we have since implemented better retry mechanisms.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME