Bug 1468084
Opened 6 years ago
Closed 6 years ago
builds/scriptworker returned 1 instead of one of [0]
Categories
(Infrastructure & Operations :: RelOps: Puppet, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: apop, Assigned: tomprince)
Today, during daily monitoring, I received several emails from Puppet reporting the following problem:
Sun Jun 10 08:29:03 -0700 2018 Puppet (err): /tools/python3/bin/python -BE /tools/misc-python3/virtualenv.py --python=/tools/python3/bin/python --distribute --never-download /builds/scriptworker returned 1 instead of one of [0]
Sun Jun 10 08:29:03 -0700 2018 /Stage[main]/Bouncer_scriptworker/Python3::Virtualenv[/builds/scriptworker]/Exec[virtualenv /builds/scriptworker]/returns (err): change from notrun to 0 failed: /tools/python3/bin/python -BE /tools/misc-python3/virtualenv.py --python=/tools/python3/bin/python --distribute --never-download /builds/scriptworker returned 1 instead of one of [0]
Can you please check, or point me to someone who could help resolve this?
Comment 1•6 years ago
Puppet is a relops tool, and Aki knows about scriptworker.
Assignee: nobody → relops
Component: Worker → RelOps: Puppet
Product: Taskcluster → Infrastructure & Operations
QA Contact: pmoore → mcornmesser
Comment 2•6 years ago
Looks like this is for tb-bouncer. Tom, do you know what this is about? Want a hand?
Flags: needinfo?(mozilla)
Assignee
Comment 3•6 years ago
It looks like, on tb-bouncer, puppet thinks `/tools` should have mode 0700, which means that cltbld can't access it. I've not been able to track down *why*. It might be related to https://github.com/mozilla/build-puppet/blob/af266054f23b38df26aa5c7f965ebcccbc9b5415/modules/bouncer_scriptworker/manifests/init.pp#L67-L73 but that doesn't explain why only the tb-* one is hitting that.
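To make the permission problem concrete, here is a minimal Python sketch of what mode 0700 versus 0755 means for a user such as cltbld. It assumes `/tools` is owned by another user (for example root) and that cltbld is not in the directory's group; the thread does not state the ownership, so treat that as an assumption. The helper names are made up for illustration.

```python
import stat


def others_can_enter(mode: int) -> bool:
    """Return True if users who are neither the owner nor in the group
    have the execute (search) bit on a directory with this mode."""
    return bool(mode & stat.S_IXOTH)


def describe(mode: int) -> str:
    """Render a directory mode the way `ls -ld` would show it."""
    return stat.filemode(stat.S_IFDIR | mode)


# Compare the mode puppet wants to apply (0700) with the hard-coded fix (0755).
for mode in (0o700, 0o755):
    print(oct(mode), describe(mode), "others can enter:", others_can_enter(mode))
```

Without the search bit on `/tools`, a non-owner cannot reach `/tools/python3/bin/python` at all, which would be consistent with the virtualenv Exec returning 1.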
Comment 4•6 years ago
Hm, I wonder if we want that block at all.
Comment 5•6 years ago
We're continuing to get emailed about this multiple times per hour -- any timeframe in mind? We can remove that block, explicitly set perms, make sure the user is cltbld instead of root, chmod it manually and see if it sticks, or other fixes.
Assignee
Comment 6•6 years ago
I've pinned these workers back to my environment for the moment, which has a fix, but I'm not sure that it is one we should land.
Assignee
Comment 7•6 years ago
I've got tb-bouncer and tb-bouncer-dev pinned to my environment. The only patch there is one that hard-codes /tools and the misc python dir to have mode 0755. As soon as I drop that change from my environment, puppet tries to switch /tools to 0700. I think this is because of the resource defaults that bouncer sets, combined with some aspect of dynamic scoping of resource defaults. But I am baffled by why this only affects the tb-bouncer workers, since it looks like their configuration is identical to the firefox ones. So I am reluctant to just get rid of the default (particularly since there appear to be many scriptworkers using them).
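As a rough illustration of how dynamically scoped resource defaults could produce this, here is a toy Python model. It is not Puppet and not the actual manifests. It assumes, in line with Puppet's documented scoping rules, that a class's dynamic parent is the scope that first declares it, and that a resource with no explicit mode picks up the nearest `File` default along that chain. `some_other_class` is a hypothetical name, and whether include order really differs between the tb-* and firefox workers is a guess that nothing in this bug confirms.

```python
class Scope:
    """One evaluated class; `parent` is the scope that first declared it."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.file_defaults = {}


def evaluate(include_order):
    """Build scopes from (declaring_class, declared_class) pairs, in order.

    Only the first declaration of a class assigns its parent scope.
    """
    scopes = {"node": Scope("node")}
    for declarer, declared in include_order:
        if declared not in scopes:
            scopes[declared] = Scope(declared, scopes[declarer])
    return scopes


def effective_mode(scope, explicit_mode=None):
    """An explicit mode wins; otherwise walk up the dynamic scope chain."""
    if explicit_mode is not None:
        return explicit_mode
    while scope is not None:
        if "mode" in scope.file_defaults:
            return scope.file_defaults["mode"]
        scope = scope.parent
    return None  # no default found: the mode is left unmanaged


def tools_mode(include_order, explicit_mode=None):
    scopes = evaluate(include_order)
    # Stand-in for the File default block in bouncer_scriptworker.
    scopes["bouncer_scriptworker"].file_defaults["mode"] = "0700"
    return effective_mode(scopes["dirs::tools"], explicit_mode)


# Case 1: bouncer_scriptworker is the first class to declare dirs::tools.
via_bouncer = [
    ("node", "bouncer_scriptworker"),
    ("bouncer_scriptworker", "dirs::tools"),
]
# Case 2: a hypothetical other class declares dirs::tools first.
via_other = [
    ("node", "some_other_class"),
    ("some_other_class", "dirs::tools"),
    ("node", "bouncer_scriptworker"),
    ("bouncer_scriptworker", "dirs::tools"),
]

print(tools_mode(via_bouncer))          # inherits the bouncer default
print(tools_mode(via_other))            # not reached by the bouncer default
print(tools_mode(via_bouncer, "0755"))  # explicit mode overrides the default
```

In this model the same default block either leaks into `/tools` or does not, depending only on which class declares `dirs::tools` first, and an explicit mode sidesteps the question entirely.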
Flags: needinfo?(mozilla)
Comment 8•6 years ago
https://github.com/mozilla-releng/build-puppet/blob/af266054f23b38df26aa5c7f965ebcccbc9b5415/modules/bouncer_scriptworker/manifests/init.pp#L67-L73 is only used by bouncer scriptworkers. I think it's a bad default for tools.
Assignee
Comment 9•6 years ago
There are identical stanzas in other scriptworkers: https://github.com/mozilla-releng/build-puppet/blob/af266054f23b38df26aa5c7f965ebcccbc9b5415/modules/shipit_scriptworker/manifests/init.pp#L59-L65 for example.
Comment 10•6 years ago
Shipit scriptworker doesn't have a tools clone. I'm betting fx bouncer scriptworker had someone manually clone tools or manually fix its perms.
Comment 11•6 years ago
Johan, did you write this? Did you intend for tools to have a 0700 perm?
Flags: needinfo?(jlorenzo)
Assignee
Comment 12•6 years ago
It isn't a clone of build-tools, it is the toplevel directory where python gets installed.
Comment 13•6 years ago
(In reply to Tom Prince [:tomprince] from comment #12)
> It isn't a clone of build-tools, it is the toplevel directory where python
> gets installed.
Ah, /tools. I think the block is doing more than it should in both locations, and we should probably explicitly list the files we want to have 0700.
Comment 14•6 years ago
(In reply to Aki Sasaki [:aki] from comment #11)
> Johan, did you write this? Did you intend for tools to have a 0700 perm?
I confirm I wrote this. I did not intend for tools to have this set of permissions. I originally copied this file from what we have in other types of workers, like pushapk (that is to say, without tools). In there, my original intent was to make 0600 the default for files defined below (like script_config.json). I see this section[1] is actually redundant with this one[2]. It was changed in [3], 2 days before this bug got filed. I don't think [3] is the root cause, though. As far as I know "File" only applies to "file" entries defined within the same package. Moreover, the python virtual env is defined to be 0700 at [4]. /tools is defined there[5].
There are a couple of things I don't understand:
a. Why doesn't this affect other types of scriptworker instance? Like Tom said, other scriptworker instances are configured the same way. For instance, pushapk has had the same config for more than 22 months[6].
b. What does the command in comment 0 try to do? Does it try to create the virtualenv, or does it try to install packages in this venv? Do you guys have fuller logs about the 755 error?
[1] https://github.com/mozilla-releng/build-puppet/blob/af266054f23b38df26aa5c7f965ebcccbc9b5415/modules/bouncer_scriptworker/manifests/init.pp#L67-L73
[2] https://github.com/mozilla-releng/build-puppet/blob/af266054f23b38df26aa5c7f965ebcccbc9b5415/modules/bouncer_scriptworker/manifests/init.pp#L79-L83
[3] https://github.com/mozilla-releng/build-puppet/pull/48
[4] https://github.com/mozilla-releng/build-puppet/blob/af266054f23b38df26aa5c7f965ebcccbc9b5415/modules/bouncer_scriptworker/manifests/init.pp#L35
[5] https://github.com/mozilla-releng/build-puppet/blob/master/modules/dirs/manifests/tools.pp#L15
[6] https://github.com/mozilla-releng/build-puppet/blame/2832467b9bc9c37d21f21b832587bafec873e1e4/modules/pushapk_scriptworker/manifests/init.pp#L84
Flags: needinfo?(jlorenzo)
Comment 15•6 years ago
I'm guessing that the scriptworker instances that were pre-existing had their /tools created long enough ago that this isn't an issue. If the block is redundant and causing problems, let's get rid of it. If people want to spend more time debugging why, I'm ok with that, but let's stop the bleeding in prod and then debug why in a dev env.
Assignee
Updated•6 years ago
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED