Closed Bug 1497309 Opened 7 years ago Closed 7 years ago

Please upgrade generic-worker on localprovisioner/nss-macos-10-12 to version 11.0.1

Categories

(NSS :: Build, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: pmoore, Assigned: pmoore)

Details

It looks like the mac workers are currently running generic-worker 10.2.3:
https://tools.taskcluster.net/groups/V-l02xoETLKBx7O7RCsZNQ/tasks/MvhqwpY4Q024AHGm8CgZ8A/runs/0/logs/public%2Flogs%2Flive_backing.log#L12
This release is a little over a year old[1], so it would be great to upgrade to a more recent release, such as 11.0.1. Is there a staging environment I could test this on? Thanks!
Flags: needinfo?(franziskuskiefer)
No staging unfortunately. But traffic is low enough to do it on the existing machine (you should still have access) and nss-try. You could also use the second machine for testing if you want (it's not used right now). Ping me if you need any help.
Flags: needinfo?(franziskuskiefer)
Thanks Franziskus. Funnily enough, we just had a sentry crash report come in for worker nss1-1/macosstadium on host administrators-Mac-mini-98.local, so probably an upgrade would be good.

https://sentry.prod.mozaws.net/operations/generic-worker/issues/4849688/events/24847726/json/

runtime error: invalid memory address or nil pointer dereference
    /home/travis/go/src/runtime/panic.go in gopanic at line 489
    /home/travis/gopath/src/github.com/taskcluster/generic-worker/sentry.go in func1 at line 36
    /home/travis/gopath/src/github.com/getsentry/raven-go/client.go in CapturePanicAndWait at line 745
    /home/travis/gopath/src/github.com/taskcluster/generic-worker/sentry.go in ReportCrashToSentry at line 47
    /home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go in HandleCrash at line 499
    /home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go in func1 at line 505
    /home/travis/go/src/runtime/asm_amd64.s in call32 at line 514
    /home/travis/go/src/runtime/panic.go in gopanic at line 489
    /home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go in func1 at line 982
    /home/travis/go/src/runtime/asm_amd64.s in call32 at line 514
    /home/travis/go/src/runtime/panic.go in gopanic at line 489
    /home/travis/go/src/runtime/panic.go in panicmem at line 63
    /home/travis/go/src/runtime/signal_unix.go in sigpanic at line 290
    /home/travis/gopath/src/github.com/taskcluster/taskcluster-client-go/http.go in String at line 46
    /home/travis/gopath/src/github.com/taskcluster/generic-worker/artifacts.go in uploadArtifact at line 438
    /home/travis/gopath/src/github.com/taskcluster/generic-worker/artifacts.go in uploadLog at line 419
    /home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go in func3 at line 1014
    /home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go in Run at line 1126
    /home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go in FindAndRunTask at line 674
    /home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go in RunWorker at line 569
    /home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go in main at line 332
    /home/travis/go/src/runtime/proc.go in main at line 185

This error was caused in taskcluster-client-go, which was called by this line in generic-worker:

https://github.com/taskcluster/generic-worker/blob/v10.2.3/artifacts.go#L438

    438 log.Print(t.CallSummary.String())

Based on the date+time of release of generic-worker v10.2.3, we can map to the corresponding commit of taskcluster-client-go, and find the line that caused the failure:

https://github.com/taskcluster/taskcluster-client-go/blob/1b02af5dfac584c998413247dda6ef1a8e2175b5/http.go#L46

    46 return fmt.Sprintf("\nCALL SUMMARY\n============\nRequest Headers:\n%#v\nRequest Body:\n%v\nResponse Headers:\n%#v\nResponse Body:\n%v\nAttempts: %v", cs.HTTPRequest.Header, cs.HTTPRequestBody, cs.HTTPResponse.Header, cs.HTTPResponseBody, cs.Attempts)

From this we can determine that the problem was that one of the following values was nil:

* cs.HTTPRequest
* cs.HTTPResponse

From inspecting the code, there are several failure cases where cs.HTTPResponse would be nil. The most likely candidate I see is here:

* https://github.com/taskcluster/taskcluster-client-go/blob/1b02af5dfac584c998413247dda6ef1a8e2175b5/http.go#L116-L120

If the http connection cannot be made after several retries, the http response could potentially be nil.

The fix is to not assume that cs.HTTPRequest and cs.HTTPResponse are non-nil. This was done in the following commit:

https://github.com/taskcluster/taskcluster-client-go/commit/1f7c623bcb7e78c1f1237e7c5535b3f4409ff23e

In other words, this worker crash is fixed, and we can see from release timestamps, that this fix would have made it into generic-worker 10.7.8.
Assignee: nobody → pmoore
Currently no tasks are running, so I will perform the upgrade now...
Upgraded to generic-worker 11.0.1:

> administrators-Mac-mini-98:~ administrator$ /usr/local/bin/generic-worker --version
> generic-worker 11.0.1 [ revision: https://github.com/taskcluster/generic-worker/commits/a0a5271ddad42022606ca47a6c16f84c128223ab ]
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Summary: Please upgrade generic-worker on localprovisioner/nss-macos-10-12 → Please upgrade generic-worker on localprovisioner/nss-macos-10-12 to version 11.0.1