Closed Bug 1359483 Opened 7 years ago Closed 7 years ago

context download fails with user-related error

Categories

(Taskcluster :: Workers, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

References

Details

https://tools.taskcluster.net/task-inspector/#HzQ6dHD8QnK6sGzk74WUZA/

[taskcluster:error]  malformed-payload error: Error downloading https://hg.mozilla.org/try//raw-file/b9133b699d91eca73d6f351fcfc4c9acffaedd4f/taskcluster/scripts/tester/test-macosx.sh: Cannot lookup user cltbld: user: Lookup requires cgo
[taskcluster:error]  malformed-payload error: Error downloading https://hg.mozilla.org/try//raw-file/b9133b699d91eca73d6f351fcfc4c9acffaedd4f/taskcluster/scripts/tester/test-macosx.sh: Cannot lookup user cltbld: user: Lookup requires cgo
Blocks: 1357753
The runtime/util.Download function is:

---
package util

import (
    "io"
    "mime"
    "net/http"
    "os"
    "path/filepath"
)

// Download downloads the request file in the url
func Download(url, destdir string) (string, error) {
    resp, err := http.Get(url)

    if err != nil {
        return "", err 
    }   

    defer resp.Body.Close()

    contentDisposition := resp.Header.Get("Content-Disposition")
    _, params, err := mime.ParseMediaType(contentDisposition)

    var filename string
    if err == nil {
        filename = params["filename"]
    } else {
        filename = filepath.Base(url)
    }   

    filename = filepath.Join(destdir, filename)
    file, err := os.Create(filename)

    if err != nil {
        return "", err 
    }   

    defer file.Close()
    _, err = io.Copy(file, resp.Body)
    if err != nil {
        os.Remove(filename)
        return "", err 
    }   

    return filename, nil 
}
---

I'm not sure what would be looking up a user.  Also, I don't think Lookup requires cgo on Linux, does it?
Flags: needinfo?(jopsen)
Strange, it seems to be coming from here:

https://github.com/golang/go/blob/go1.8.1/src/os/user/lookup_stubs.go#L50

But I don't see any obvious calling paths that reach this code line.
The only place I see that calls this is:

https://github.com/golang/go/blob/go1.8.1/src/os/user/lookup.go#L14

But I can't find any code in the standard library that directly calls user.Lookup ....
Have you considered using generic-worker until taskcluster-worker is more stable?
(In reply to Pete Moore [:pmoore][:pete] from comment #4)
> Have you considered using generic-worker until taskcluster-worker is more
> stable?

Yep, that wouldn't help fix this bug :)
Adding some logging:

[cltbld@moonshot-test3 ~]$ /usr/local/bin/taskcluster-worker daemon run /etc/taskcluster-worker.yml
2017/04/26 08:41:53 Trying host-secrets server http://releng-puppet1.srv.releng.scl3.mozilla.com:8020/v1/credentials...
2017/04/26 08:41:53 Success: host-secrets server gave clientId project/releng/host-secrets/host/com.mozilla.scl3.releng.test.moonshot-test3...
2017/04/26 08:42:14 http.Get -> %s <nil>
2017/04/26 08:42:14 mime.PMT -> %s (not returned) <nil>
2017/04/26 08:42:14 os.Create(%s) -> %s /home/cltbld/test-macosx.sh <nil>
2017/04/26 08:42:14 io.Copy() -> %s <nil>
INFO[0021] Stopping worker after task as instructed by task.payload.reboot = 'always'  plugin=reboot prefix=plugin.reboot runId=0 taskId=FqhMboyjRhyBO4Ev8mdhdw
ERRO[0024] Recovered from panic:
 runtime error: invalid memory address or nil pointer dereference  incidentId=6a204096-8604-42b7-8ce2-aaf3c4fc2ca5 panic=runtime error: invalid memory address or nil pointer dereference prefix=worker.taskrun runId=0 stage=dispose taskId=FqhMboyjRhyBO4Ev8mdhdw
ERRO[0025] fatal error from TaskRun.Dispose() stopping now  prefix=worker runId=0 taskId=FqhMboyjRhyBO4Ev8mdhdw
Worker successfully started
Ah, I see -- there are two "Error downloading" messages, one potentially nested inside the other.

    if b.payload.Context != "" {
        if err = fetchContext(b.payload.Context, user); err != nil {
            err = runtime.NewMalformedPayloadError(
                fmt.Sprintf("Error downloading %s: %v", b.payload.Context, err),
            )   
            b.context.LogError(err)
            return
        }   
    }   

...

func fetchContext(context string, user *system.User) error {
    // TODO: use future cache subsystem, when we have it
    // TODO: use the soon to be merged fetcher subsystem
    filename, err := util.Download(context, user.Home())
    if err != nil {
        return fmt.Errorf("Error downloading '%s': %v", context, err)
    }   

    // TODO: verify if this will harm Windows
    // TODO: abstract this away in system package
    if err = os.Chmod(filename, 0700); err != nil {
        return fmt.Errorf("Error setting file '%s' permissions: %v", filename, err)
    }   

    if err = system.ChangeOwner(filename, user); err != nil {
        return err 
    }   
...

So it must be from that bare "return err".  If it was the `user.Home()` call above, the error would be "Error downloading ..: Error downloading ..: .."

Anyway, for the native engine, I don't think we need to chown things?
based on the build flags in https://github.com/golang/go/blob/go1.8.1/src/os/user/lookup_stubs.go it looks like these are used whenever CGO_ENABLED=0, which is always for linux and windows in tc-worker.

It seems the best course is just to change that, so..
  https://github.com/taskcluster/taskcluster-worker/pull/256
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(jopsen)
Resolution: --- → FIXED
Component: Worker → Workers
You need to log in before you can comment on or make changes to this bug.