Closed Bug 1132214 Opened 9 years ago Closed 9 years ago

Allow task definition to explicitly state the location of the log

Categories

(Taskcluster :: Workers, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: garndt, Assigned: jlorenzo)

References

Details

Attachments

(1 file, 1 obsolete file)

If a task declares that it should take measures to be secret, then live logging feature should be disabled and the bulk log artifact should be created in a private bucket.
Assignee: nobody → garndt
Or live log should be placed under "private/logs/terminal.log"
And then this should be implemented along with bug 1132221.

Granted the livelog would still be served over HTTP and not HTTPS. But it'll only be served when requested, so it's not likely to be requested.
Anyways, just a suggestion.
Component: TaskCluster → Docker-Worker
Product: Testing → Taskcluster
Summary: [docker-worker] Live logging should be disabled for tasks enabling the "secret" feature in the task payload → Allow task logs to be uploaded under a private artifact path
Assignee: garndt → nobody
is chunk 29 from here intermitten and not caused by your changes? https://treeherder.mozilla.org/#/jobs?repo=try&revision=d2ba56aa3467
sorry, wrong bug!
The option 'log' under task.payload will be added to allow the creator to specify a log location path.

Currently logs are stored under a public path, which is less than ideal for more sensitive tasks.  We will still default to using the path if not defined, but allow the path to be overwritten.
Summary: Allow task logs to be uploaded under a private artifact path → Allow task definition to explicitly state the location of the log
Having private logs is needed to get the parity with Jenkins (bug 1225457).

Discussed with :garndt, I'll give it a shot.
Assignee: nobody → jlorenzo
Blocks: 1225457
Status: NEW → ASSIGNED
Attached file docker-worker PR (obsolete) —
Depends on: 1228284
Depends on: 1228583
Depends on: 1228585
Like said in bug 1228583 comment 7, I ended up using the dummy PR, as it's the only one that has CI on it.

Since yesterday (on IRC), I've been able to run the tests locally. I don't have any proven explanation. The only thing I tried was to run the tests on my Mac and then the tests on my Linux worked. 

This enabled me to fix the remaining issues in my code. I ran local_live_log_test.js locally => the 4 tests passed.

Manually, I verified the task created the logs at the correct path[1]. However, I've not been able to download them, to verify their content. I used [2] to get a bewit link with the credentials used for this project. The error I get is:
> {
>   "message": "Authorization Failed",
>   "error": {
>     "info": "None of the scope-sets was satisfied",
>     "scopesets": [[
>         "queue:get-artifact:private/docker-worker-tests/logs/live_backing.log"
>     ]],
>     "scopes": []
>   }
> }

Greg, does "get-artifact:private/docker-worker-tests/*" enable one to download in sub-directories?

Finally, 2 tests are failing in CI[3]:
1) device linking within container @disabled link testdroid device => it was already failing at[4]
2) Docker custom private registry => I couldn't run it as I'm mising this scope:
> ["queue:create-task:no-provisioning-nope/*"]

[1] https://tools.taskcluster.net/task-inspector/#R-uAYSAoQ7KeOA1gZZv4tw/0
[2] https://github.com/askeing/taskcluster-util-python
[3] https://tools.taskcluster.net/task-graph-inspector/#D152ExS6RraAZ9Tvtn8-Ig/
[4] https://tools.taskcluster.net/task-graph-inspector/#Y7NOLqdMRVeR4tjNAP9WNw/
Attachment #8692457 - Attachment is obsolete: true
Attachment #8694692 - Flags: review?(garndt)
Blocks: 1230134
Comment on attachment 8694692 [details] [review]
docker-worker PR (was the dummy PR)

Looks good, left some comments in the PR
Attachment #8694692 - Flags: review?(garndt) → review+
I fixed the nits pointed out. Tested locally and [1] show errors that don't seem related to the patch. Do you think we're good to merge the PR, Greg?

[1] https://tools.taskcluster.net/task-graph-inspector/#y8hhr4gzRu2q7BCgufCqyg
Flags: needinfo?(garndt)
If you rebase from master, it should fix the "app" test failure in chunk 2.  all will continue to fail and has nothing to do with your tests.
Flags: needinfo?(garndt)
It seems we're all set. I don't have the rights to merge the PR. Would you mind handling it?
Flags: needinfo?(garndt)
Keywords: checkin-needed
I can after I check why you're getting an error with one of the tests that is not present elsewhere.
(In reply to Greg Arndt [:garndt] from comment #13)
> I can after I check why you're getting an error with one of the tests that
> is not present elsewhere.

hi garndt, i could push this PR but i have to wait till your check is complete or ?
(In reply to Carsten Book [:Tomcat] from comment #14)
> (In reply to Greg Arndt [:garndt] from comment #13)
> > I can after I check why you're getting an error with one of the tests that
> > is not present elsewhere.
> 
> hi garndt, i could push this PR but i have to wait till your check is
> complete or ?

This is a github PR for a project I work on, so there isn't a need for a sheriff to merge anything, but thanks for following up.  I think :jlorenzo can just remove the checkin-needed flag.

Thanks again :)
Keywords: checkin-needed
Ok, I found the issue with the one test failing after these commits.  R+ carries forward assuming this gets fixed and tests (other than the "all" chunk) are green.  I can merge once that happens.
Flags: needinfo?(garndt)
Thanks for the help! I pushed the one-line-change you proposed and ran the tests 3 times

There is one remaining consistent issue:
> pull image ensure public indexed image can be pulled:
>   Error: HTTP code is 409 which indicates error: conflict - Conflict, cannot delete 2cc966a5578a because
>   the container 12223698de6b is using it, use -f to force

It looks related to the environment, not the code change, right?

[1] https://tools.taskcluster.net/task-graph-inspector/#RNKVQOPQRRO7pmh9VESZVQ/8v6GPGuJTu-DQnEiu9r5Ww/0
https://tools.taskcluster.net/task-graph-inspector/#6K4dIS-NRru9Cbfec3HQQQ/ptm9_cEURhG5i4BJnWTLcg/0
https://tools.taskcluster.net/task-graph-inspector/#JkQJSZf1QIisn8-OgoYEwA/_0Ye6vWYSCa4sfzDq64pOQ/0
(In reply to Johan Lorenzo [:jlorenzo] (QA) from comment #17)
Okay, I rerun the jobs one more time[1] a few hours ago and got the same error. I now doubt it's just the environment. I'll look into that.

[1] https://tools.taskcluster.net/task-graph-inspector/#j2luDnOfSxCmWjvYPFr_uA/TAy0EsxZTdiRmFZMdjFRjQ/0
I'm going to run locally, but I'm not confident it's related to the code changes.  Your change doesn't touch anything that would affect that from what I can see.
I'm not sure why it just started failing, but I can duplicate that error when running just that one test against master.  I don't think it's related to this PR.  If you want, I can always create images based on this PR (without merging to master first) and deploy it to the worker types used for device testing if you want to kick the tires on it. I'm on PTO next week so I want to do as much as possible to get you unblocked and getting things going for you.
Thanks for the test.

Merged in master at: https://github.com/taskcluster/docker-worker/commit/5e88ee1953f37dc2ce58e44471ee803bab025c7b
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Blocks: 1234508
Blocks: 1234516
Component: Docker-Worker → Workers
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: