Closed Bug 1624602 Opened 5 years ago Closed 5 years ago

Intermittent failure in TestFakeGenericWorker

Categories

(Taskcluster :: Workers, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

References

Details

2020/03/24 13:59:56 Writing files
2020/03/24 13:59:56 Starting worker
2020/03/24 13:59:56 error reading from protocol: read |0: file already closed
--- FAIL: TestFakeGenericWorker (0.79s)
    run_test.go:73: 
        	Error Trace:	run_test.go:73
        	Error:      	Not equal: 
        	            	expected: []map[string]interface {}{map[string]interface {}{"conversationLevel":"low", "textPayload":"workin hard or hardly workin, amirite?"}}
        	            	actual  : []map[string]interface {}(nil)
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1,7 +1,2 @@
        	            	-([]map[string]interface {}) (len=1) {
        	            	- (map[string]interface {}) (len=2) {
        	            	-  (string) (len=17) "conversationLevel": (string) (len=3) "low",
        	            	-  (string) (len=11) "textPayload": (string) (len=38) "workin hard or hardly workin, amirite?"
        	            	- }
        	            	-}
        	            	+([]map[string]interface {}) <nil>
        	            	 
        	Test:       	TestFakeGenericWorker

I suspect this is the worker shutting down before its buffer is flushed?

I accidentally left this single test (-run TestFakeGenericWorker) running in a tight loop for hours, and it didn't fail, so I can say confidently that I can't reproduce this locally.

This looks very familiar! Basically, it looks like cmd.Wait closes the read end of the connection out from under the protocol.

https://golang.org/pkg/os/exec/#Cmd.StdoutPipe

StdoutPipe returns a pipe that will be connected to the command's standard output when the command starts.

Wait will close the pipe after seeing the command exit, so most callers need not close the pipe themselves. It is thus incorrect to call Wait before all reads from the pipe have completed. For the same reason, it is incorrect to call Run when using StdoutPipe. See the example for idiomatic usage.

https://community-tc.services.mozilla.com/tasks/LeXK7Q65QXqmjqfCMo_ZXg/runs/0/logs/https%3A%2F%2Fcommunity-tc.services.mozilla.com%2Fapi%2Fqueue%2Fv1%2Ftask%2FLeXK7Q65QXqmjqfCMo_ZXg%2Fruns%2F0%2Fartifacts%2Fpublic%2Flogs%2Flive.log

=== RUN   TestFakeGenericWorker
2020/04/04 17:27:03 Loading worker-runner configuration from /tmp/dir062749099/runner.yaml
2020/04/04 17:27:03 Configuring with provider standalone
2020/04/04 17:27:03 Identified as worker wg/wi
2020/04/04 17:27:03 Configuring for worker implementation generic-worker
2020/04/04 17:27:03 provider metadata local-ipv4 not available; not setting config privateIP
2020/04/04 17:27:03 provider metadata instance-type not available; not setting config instanceType
2020/04/04 17:27:03 provider metadata instance-id not available; not setting config instanceID
2020/04/04 17:27:03 provider metadata region not available; not setting config region
2020/04/04 17:27:03 provider metadata availability-zone not available; not setting config availabilityZone
2020/04/04 17:27:03 provider metadata public-ipv4 not available; not setting config publicIp
2020/04/04 17:27:03 Writing files
2020/04/04 17:27:03 Starting worker
2020/04/04 17:27:03 error reading from protocol: read |0: file already closed
==================
WARNING: DATA RACE
Write at 0x000000d95bc0 by goroutine 8:
  github.com/taskcluster/taskcluster/v28/tools/worker-runner/runner.teardownLogging()
      /go/taskcluster/tools/worker-runner/runner/run_test.go:35 +0x66
  runtime.call32()
      /usr/local/go/src/runtime/asm_amd64.s:539 +0x3a
  testing.(*T).FailNow()
      <autogenerated>:1 +0x43
  github.com/stretchr/testify/require.Equal()
      /go/pkg/mod/github.com/stretchr/testify@v1.5.1/require/require.go:165 +0x122
  github.com/taskcluster/taskcluster/v28/tools/worker-runner/runner.TestFakeGenericWorker()
      /go/taskcluster/tools/worker-runner/runner/run_test.go:73 +0x7ce
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:909 +0x199
Previous read at 0x000000d95bc0 by goroutine 10:
  github.com/taskcluster/taskcluster/v28/tools/worker-runner/logging/protocol.SetProtocol.func1()
      /go/taskcluster/tools/worker-runner/logging/protocol/proto.go:14 +0xb7
  github.com/taskcluster/taskcluster/v28/tools/worker-runner/protocol.(*Protocol).recvLoop()
      /go/taskcluster/tools/worker-runner/protocol/protocol.go:150 +0x84
Goroutine 8 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:960 +0x651
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:1202 +0xa6
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:909 +0x199
  testing.runTests()
      /usr/local/go/src/testing/testing.go:1200 +0x521
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:1117 +0x2ff
  main.main()
      _testmain.go:48 +0x223
Goroutine 10 (finished) created at:
  github.com/taskcluster/taskcluster/v28/tools/worker-runner/protocol.(*Protocol).Start()
      /go/taskcluster/tools/worker-runner/protocol/protocol.go:94 +0x29a
  github.com/taskcluster/taskcluster/v28/tools/worker-runner/runner.Run()
      /go/taskcluster/tools/worker-runner/runner/run.go:191 +0xa87
  github.com/taskcluster/taskcluster/v28/tools/worker-runner/runner.TestFakeGenericWorker()
      /go/taskcluster/tools/worker-runner/runner/run_test.go:70 +0x56e
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:909 +0x199
==================
--- FAIL: TestFakeGenericWorker (0.55s)
    run_test.go:73: 
        	Error Trace:	run_test.go:73
        	Error:      	Not equal: 
        	            	expected: []map[string]interface {}{map[string]interface {}{"conversationLevel":"low", "textPayload":"workin hard or hardly workin, amirite?"}}
        	            	actual  : []map[string]interface {}(nil)
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1,7 +1,2 @@
        	            	-([]map[string]interface {}) (len=1) {
        	            	- (map[string]interface {}) (len=2) {
        	            	-  (string) (len=17) "conversationLevel": (string) (len=3) "low",
        	            	-  (string) (len=11) "textPayload": (string) (len=38) "workin hard or hardly workin, amirite?"
        	            	- }
        	            	-}
        	            	+([]map[string]interface {}) <nil>
        	            	 
        	Test:       	TestFakeGenericWorker
    testing.go:853: race detected during execution of test

Hey go-lingual friends, plz 2 review PR?

Flags: needinfo?(wcosta)
Flags: needinfo?(pmoore)
Flags: needinfo?(miles)

(In reply to Dustin J. Mitchell [:dustin] (he/him) from comment #7)

Hey go-lingual friends, plz 2 review PR?

r+

Flags: needinfo?(wcosta)
Flags: needinfo?(pmoore)
Flags: needinfo?(miles)
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.