Closed Bug 1225116 Opened 9 years ago Closed 8 years ago

Artifacts can be uploaded twice without error

Categories

(Taskcluster :: Services, defect)


Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: pmoore, Unassigned)

References

Details

I noticed this while writing generic-worker tests. I had a bug in my code that caused log files to be uploaded twice, which I noticed when the "artifact created" pulse message arrived twice. Digging into the code, I saw it was a genuine error that had gone unnoticed because the queue did not report a failure. Here is my log output:

Uploading artifact public/logs/task_log.txt
Artifact seen via pulse: &{{"text/plain" "2015-12-16T14:18:05.328Z" "public/logs/task_log.txt" "s3"} '\x00' {"2015-11-17T14:18:05.328Z" "2015-12-16T14:18:05.328Z" "test-provisioner" '\x01' [{"scheduled" "" "0001-01-01T00:00:00.000Z" '\x00' "2015-11-16T14:18:06.822Z" "2015-11-16T14:18:07.610Z" "running" "2015-11-16T14:38:07.511Z" "test-worker-group" "test-worker-id"}] "test-scheduler" "running" "MjDm5d4jSySD-hfNjY05OQ" "cIsNmYXQTrCFe5fo8_KVgA" "TFzSBBqYQd6SQnrUi7qUCQ"} '\x01' "test-worker-group" "test-worker-id"}
Uploading artifact public/logs/task_log.txt
Artifact seen via pulse: &{{"text/plain" "2015-12-16T14:18:05.328Z" "public/logs/task_log.txt" "s3"} '\x00' {"2015-11-17T14:18:05.328Z" "2015-12-16T14:18:05.328Z" "test-provisioner" '\x01' [{"scheduled" "" "0001-01-01T00:00:00.000Z" '\x00' "2015-11-16T14:18:06.822Z" "2015-11-16T14:18:07.610Z" "running" "2015-11-16T14:38:07.511Z" "test-worker-group" "test-worker-id"}] "test-scheduler" "running" "MjDm5d4jSySD-hfNjY05OQ" "cIsNmYXQTrCFe5fo8_KVgA" "TFzSBBqYQd6SQnrUi7qUCQ"} '\x01' "test-worker-group" "test-worker-id"}
This is expected behavior. Pulse uses at-least-once delivery semantics, and you are allowed to retry calls. Most of them are idempotent; that is the case here.
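Because pulse only guarantees at-least-once delivery, a consumer that must not process the same event twice has to de-duplicate on its own. A minimal sketch (the `handle` function and `seen` set are illustrative, not part of any Taskcluster API):

```python
# Hypothetical consumer that tolerates at-least-once delivery by
# de-duplicating on a key derived from the message. The key here is
# (taskId, artifact name), chosen purely for illustration.
seen = set()

def handle(message):
    """Process a pulse-style message; ignore redeliveries of the same event."""
    key = (message["taskId"], message["artifact"]["name"])
    if key in seen:
        return False  # duplicate delivery: already handled, skip it
    seen.add(key)
    # ... real event processing would go here ...
    return True

msg = {"taskId": "MjDm5d4jSySD-hfNjY05OQ",
       "artifact": {"name": "public/logs/task_log.txt"}}
handle(msg)  # → True  (first delivery is processed)
handle(msg)  # → False (redelivery is ignored)
```

Note that this only absorbs duplicate *deliveries* of one event; as the next comment explains, the duplicates in this bug were two genuinely separate uploads.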
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → INVALID
The pulse messages above were for two separate uploads, rather than pulse delivering the same message twice. The two artifact upload operations in this case ran serially: first there was an upload, then the message arrived on pulse, then there was a second upload, and then a second message arrived on pulse.

After the first upload has completed successfully, I believe we shouldn't allow another upload to occur, with potentially different content for the artifact. If the first upload fails, I believe we should allow a second upload to take place - but that is not what happened in this case. My understanding is that artifacts are immutable - and the current queue behaviour does not enforce this.
Status: RESOLVED → REOPENED
Flags: needinfo?(jopsen)
Resolution: INVALID → ---
(In other words, the idempotency relates to a failed transaction being retried - not to allowing a retry after a successful transaction, which would allow an immutable artifact to be changed.)
.... Thinking about this some more: we just get back the s3 url, so taskcluster doesn't track whether the upload took place or not. It therefore does seem sane, with the current limitations, that the queue reissues the same s3 url, even if the s3 upload completed without the queue knowing about it. Unfortunately the queue also doesn't (yet) track artifact hash or content length.

When bug 1155645 is implemented, the queue *will* know whether the artifact was already successfully published or not, and then it can prevent subsequent uploads of the same file. At the moment, the queue doesn't have that information, so we have to live with this limitation. When bug 1190016 is implemented, the queue will be able to reject mutated artifacts; at the moment it doesn't know if the content is being mutated.

Rather than resolving this bug as invalid, I'm marking it as dependent on the other two bugs, since at the moment it *is* possible for a worker to upload an artifact, for the upload to complete, and then for the worker to upload different content for the same artifact without the queue preventing it. However, when these bugs land, bug 1190016 will prevent mutations and bug 1155645 will prevent re-uploading of a successfully uploaded artifact.
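Once the queue tracks an artifact hash, the mutation check described above becomes straightforward: record the digest on first publish, and reject any later upload whose digest differs. A sketch of that behaviour (the `Queue` class and method name are hypothetical - this is not the real Taskcluster queue API):

```python
import hashlib

class Queue:
    """Illustrative queue that enforces artifact immutability via content hash."""

    def __init__(self):
        # (taskId, artifact name) -> sha256 hex digest of published content
        self._hashes = {}

    def create_artifact(self, task_id, name, content):
        digest = hashlib.sha256(content).hexdigest()
        key = (task_id, name)
        if key in self._hashes and self._hashes[key] != digest:
            # Re-upload with different content: artifact mutation, rejected.
            raise ValueError("artifact is immutable: content differs")
        # First upload, or an identical re-upload (idempotent, allowed).
        self._hashes[key] = digest
        return digest

q = Queue()
q.create_artifact("task1", "public/logs/task_log.txt", b"log line\n")
q.create_artifact("task1", "public/logs/task_log.txt", b"log line\n")   # allowed
# q.create_artifact("task1", "public/logs/task_log.txt", b"mutated")   # raises
```

This matches the distinction drawn in the following comments: identical re-uploads remain legal (the call stays idempotent), while mutations are rejected.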
Depends on: 1155645, 1190016
> the idempotency relates to a failed transaction being retried

Nope. f : X -> X is idempotent if it holds that f(f(x)) = f(x). As many functions have side effects, this is not a given in distributed systems, but it is an extremely useful property... Uploading twice is allowed as long as you upload the same thing. Yes, that will get better once the queue tracks the hash :) For now it's an approximation, that is true. Note: in the future it'll still be possible to upload twice, as long as you upload the same thing.
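The property f(f(x)) = f(x) can be seen concretely: an operation that *sets* state is idempotent, while one that *accumulates* state is not. A small sketch (the `publish` function is illustrative, not a real API):

```python
def publish(store, name, content):
    """Idempotent: setting the same content twice equals setting it once."""
    store = dict(store)       # work on a copy so each call's result is inspectable
    store[name] = content
    return store

def append(store, name, content):
    """Not idempotent: each application grows the stored value."""
    store = dict(store)
    store[name] = store.get(name, "") + content
    return store

once  = publish({}, "log.txt", "hello")
twice = publish(once, "log.txt", "hello")
once == twice        # → True: f(f(x)) == f(x)

a_once  = append({}, "log.txt", "hello")
a_twice = append(a_once, "log.txt", "hello")
a_once == a_twice    # → False: applying twice changed the result
```

This is why re-uploading an *identical* artifact is safe to allow, and why it stays safe even after the queue learns to track hashes.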
Flags: needinfo?(jopsen)
OK, you make a fair point =) Removing bug 1155645 as a dependency, since it is intended to allow uploading of an identical artifact. Thanks Jonas.
No longer depends on: 1155645
See Also: → 1155645
I don't think this was ever valid...
Status: REOPENED → RESOLVED
Closed: 9 years ago → 8 years ago
Resolution: --- → INVALID
Component: Queue → Services