GCP docker-worker images don't have lz4
Categories: Taskcluster :: Workers, defect
Tracking: Not tracked
People: Reporter: dustin, Assigned: miles
Description • Reporter (dustin)
Nov 03 15:19:12 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker {"type":"ensure image","source":"top","provisionerId":"proj-servo","workerId":"714612925948711628","workerGroup":"community-tc-workers-google","workerType":"docker","workerNodeType":"projects/757942385826/machineTypes/n1-standard-4","image":{"type":"task-image","path":"public/image.tar.lz4","taskId":"D8pWPKUgSb-4XI_GGunu5g"}}
Nov 03 15:19:14 proj-servo-docker-oy5aes4orzebkdskvi9ayw systemd-networkd veth8c32a93: Gained IPv6LL
Nov 03 15:19:22 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker {"type":"aliveness check","source":"top","provisionerId":"proj-servo","workerId":"714612925948711628","workerGroup":"community-tc-workers-google","workerType":"docker","workerNodeType":"projects/757942385826/machineTypes/n1-standard-4","alive":true,"uptime":976,"interval":30000}
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker Uncaught Exception! Attempting to report to Sentry and crash.
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker Error: spawn lz4 ENOENT
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker at Process.ChildProcess._handle.onexit (internal/child_process.js:190:19)
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker at onErrorNT (internal/child_process.js:362:16)
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker at _combinedTickCallback (internal/process/next_tick.js:139:11)
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker at process._tickDomainCallback (internal/process/next_tick.js:219:9)
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker reportError - level: fatal, tags: {}
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker { Error: spawn lz4 ENOENT
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker at Process.ChildProcess._handle.onexit (internal/child_process.js:190:19)
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker at onErrorNT (internal/child_process.js:362:16)
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker at _combinedTickCallback (internal/process/next_tick.js:139:11)
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker at process._tickDomainCallback (internal/process/next_tick.js:219:9)
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker errno: 'ENOENT',
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker code: 'ENOENT',
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker syscall: 'spawn lz4',
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker path: 'lz4',
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker spawnargs:
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker [ '-d',
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker '/mnt/tmp-docker-images/TjBUZ1WHReW41i9BBVpcWQ/image.tar.lz4',
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker '/mnt/tmp-docker-images/TjBUZ1WHReW41i9BBVpcWQ/image.tar' ] }
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker Succesfully reported error to Sentry.
Nov 03 15:19:30 proj-servo-docker-oy5aes4orzebkdskvi9ayw docker-worker 2019/11/03 15:19:30 exit status 1
The image is docker-worker-gcp-googlecompute-2019-10-08t02-31-36z. I think that was built with monopacker?
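For context: when a task payload references a task-image artifact ending in .lz4 (as in the payload logged above), docker-worker shells out to the external lz4 binary to decompress it before loading it into Docker. A minimal sketch of that step, assuming a helper along these lines (the function name is illustrative, not the actual docker-worker source):

const { spawn } = require('child_process');

// Sketch only: docker-worker runs roughly `lz4 -d image.tar.lz4 image.tar`,
// matching the spawnargs in the error above. If the lz4 binary is missing
// from the worker image, Node's spawn emits an 'error' with code ENOENT.
function decompressImage(lz4Path, tarPath) {
  return new Promise((resolve, reject) => {
    const proc = spawn('lz4', ['-d', lz4Path, tarPath]);
    proc.on('error', reject); // ENOENT lands here when lz4 is not on PATH
    proc.on('exit', code =>
      code === 0 ? resolve(tarPath) : reject(new Error(`lz4 exited with code ${code}`))
    );
  });
}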
Comment 1 • Reporter (dustin) • 5 years ago
https://github.com/taskcluster/docker-worker/pull/524 addresses the "Uncaught Exception!" part and should convert this from a wedged worker plus a claim-expired run into a "regular" task failure (a rough sketch follows below). It still remains to build an image that contains lz4.
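A hedged sketch of the shape of change that PR makes (the helper names and the task.log/task.fail methods are placeholders, not the PR's actual code): catch the spawn failure where the image is prepared and resolve the run as failed, instead of letting the exception crash the worker and leave the claim to expire.

// Illustrative only -- see the PR for the real change.
async function ensureTaskImage(task, lz4Path, tarPath) {
  try {
    await decompressImage(lz4Path, tarPath); // from the sketch above
  } catch (err) {
    if (err.code === 'ENOENT') {
      // Surface the problem in the task log and fail this run cleanly,
      // rather than crashing the worker (wedged worker + claim-expired).
      task.log(`Could not decompress task image: ${err.message}`);
      return task.fail(); // placeholder for docker-worker's failure path
    }
    throw err;
  }
}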
Comment 2 • Reporter (dustin) • 5 years ago
==> Builds finished. The artifacts of successful builds are:
2019/11/03 21:40:00 machine readable: docker_worker_gcp,artifact []string{"0", "builder-id", "packer.googlecompute"}
2019/11/03 21:40:00 machine readable: docker_worker_gcp,artifact []string{"0", "id", "docker-worker-gcp-googlecompute-2019-11-03t21-30-52z"}
2019/11/03 21:40:00 machine readable: docker_worker_gcp,artifact []string{"0", "string", "A disk image was created: docker-worker-gcp-googlecompute-2019-11-03t21-30-52z"}
2019/11/03 21:40:00 machine readable: docker_worker_gcp,artifact []string{"0", "files-count", "0"}
2019/11/03 21:40:00 machine readable: docker_worker_gcp,artifact []string{"0", "end"}
--> docker_worker_gcp: A disk image was created: docker-worker-gcp-googlecompute-2019-11-03t21-30-52z
2019/11/03 21:40:00 machine readable: docker_worker_gcp,artifact []string{"1", "builder-id", ""}
2019/11/03 21:40:00 machine readable: docker_worker_gcp,artifact []string{"1", "id", ""}
2019/11/03 21:40:00 machine readable: docker_worker_gcp,artifact []string{"1", "string", ""}
2019/11/03 21:40:00 machine readable: docker_worker_gcp,artifact []string{"1", "files-count", "0"}
2019/11/03 21:40:00 machine readable: docker_worker_gcp,artifact []string{"1", "end"}
2019/11/03 21:40:00 [INFO] (telemetry) Finalizing.
--> docker_worker_gcp:
2019/11/03 21:40:00 waiting for all plugin processes to complete...
2019/11/03 21:40:00 /usr/local/bin/packer: plugin process exited
2019/11/03 21:40:00 /usr/local/bin/packer: plugin process exited
2019/11/03 21:40:00 [ERR] Error decoding response stream 62: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 39: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 55: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 73: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 78: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 81: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 59: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 84: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 44: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 53: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 58: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 60: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 45: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 56: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 80: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 83: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 66: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 70: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 71: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 72: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 82: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 42: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 43: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 47: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 65: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 64: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 68: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 79: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 37: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 69: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 67: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 41: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 38: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 40: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 50: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 52: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 77: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 51: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 54: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 63: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 75: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 76: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 49: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 57: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 74: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 46: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 48: EOF
2019/11/03 21:40:00 [ERR] Error decoding response stream 61: EOF
2019/11/03 21:40:00 /usr/local/bin/packer: plugin process exited
2019/11/03 21:40:00 /usr/local/bin/packer: plugin process exited
2019/11/03 21:40:00 /usr/local/bin/packer: plugin process exited
2019/11/03 21:40:00 /usr/local/bin/packer: plugin process exited
2019/11/03 21:40:00 /usr/local/bin/packer: plugin process exited
2019/11/03 21:40:00 /usr/local/bin/packer: plugin process exited
So, I guess that's... success?
Comment 3 • Reporter (dustin) • 5 years ago
I guess not... instances created with that image just never do anything, not even logging. Halp?
Comment 4 • 5 years ago
For now, as a workaround, I've made Servo CI upload uncompressed images. The build image is 1.5 GB uncompressed and takes ~25 seconds to download from GCP; with lz4 it is half that size and takes ~5 seconds to download from AWS (on the taskcluster.net deployment).
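Concretely, the workaround just points the docker-worker payload at the uncompressed artifact; compare with the payload in the description (the uncompressed artifact path below is an assumed name, the taskId is the one from the log):

// Compressed (needs lz4 on the worker image):
"image": { "type": "task-image", "taskId": "D8pWPKUgSb-4XI_GGunu5g", "path": "public/image.tar.lz4" }

// Workaround -- uncompressed tarball, no lz4 required:
"image": { "type": "task-image", "taskId": "D8pWPKUgSb-4XI_GGunu5g", "path": "public/image.tar" }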
Comment 5 • Reporter (dustin) • 5 years ago
Miles was able to build an image and is testing it.
Updated • Reporter • 5 years ago
Comment 6 • Reporter (dustin) • 5 years ago
Miles, this is now deployed everywhere, right?
Comment 7 • 5 years ago
As of this writing, https://community-tc.services.mozilla.com/tasks/ciL0gzGYSQSLT3RDZJFM6Q/runs/0 has been "running" for 20 minutes and does not have a log file. This is the same symptom as this bug, and I made that task specifically to try an lz4-compressed Docker image.
The worker uses GCP image projects/taskcluster-imaging/global/images/docker-worker-gcp-googlecompute-2019-11-04t22-31-35z, copied from https://community-tc.services.mozilla.com/worker-manager/proj-servo%2Fdocker, which comes from https://github.com/mozilla/community-tc-config/blob/9d6dce8d2463f7d61f22d389921a2c18f23b54f3/generate/workers.py#L67-L70.
It looks like the fix is not deployed to that image at least.
Comment 8 • Assignee (miles) • 5 years ago
Yes, the docker-worker-gcp-googlecompute-2019-11-04t22-31-35z image should have lz4, or at least have the liblz4-tool package installed, per https://github.com/taskcluster/monopacker/pull/27.
I'm not sure what the issue is; once we have logs, I imagine we'll be able to understand it better.
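In the meantime, one way to confirm from the worker side whether a given image actually has the binary (a hedged diagnostic sketch, not part of docker-worker) is to invoke lz4 the same way docker-worker's spawn call would:

const { execFile } = require('child_process');

// Diagnostic sketch: ENOENT here means the lz4 binary is not on PATH,
// which is exactly the failure mode in the original report.
execFile('lz4', ['--version'], (err, stdout) => {
  if (err && err.code === 'ENOENT') {
    console.error('lz4 is not installed on this image');
  } else if (err) {
    console.error('lz4 is present but failed to run:', err.message);
  } else {
    console.log('lz4 found:', stdout.trim());
  }
});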
Comment 9 • 5 years ago
Oh sorry, it looks like I misdiagnosed. https://community-tc.services.mozilla.com/tasks/ciL0gzGYSQSLT3RDZJFM6Q/runs/0 completed successfully with a compressed Docker image. So I think this bug is fixed. But the log was inaccessible until the task ended, even though bug 1593520 is fixed and other docker-worker tasks do have live logs. I’ll comment there about that.