Closed Bug 564565 Opened 14 years ago Closed 14 years ago

new talos build slaves cannot mount /N

Categories

(mozilla.org Graveyard :: Server Operations, task)

x86
macOS
task
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jhford, Assigned: aravind)

This is causing puppet to fail.

I get the following output when I try to mount manually:

bash-3.2# mount -t nfs 10.2.71.136:/export/buildlogs/puppet-files /N
Cannot MNT RPC: RPC: Remote system error - Operation timed out
Cannot MNT RPC: RPC: Remote system error - Operation timed out
mount_nfs: can't access /export/buildlogs/puppet-files: Permission denied

Apparently, these slaves are in a new range of IPs, 10.250.51.XXX. Can we make sure that these slaves are able to mount the NFS share?
This is blocking the new talos slaves from being able to run in production.  We need to reboot machines talos-r3-snow-{021...050}, talos-r3-leopard-{041...050} when this is resolved.
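For the reboot step, the two brace ranges above can be expanded into explicit hostnames. The following is only a minimal sketch: the `slave_hosts` function name is hypothetical, and the ssh/reboot invocation in the trailing comment is an assumption rather than the procedure RelEng actually used.

```shell
# Hypothetical helper: print zero-padded slave hostnames, one per line.
# Usage: slave_hosts PREFIX FIRST LAST
# Pass FIRST and LAST without leading zeros (21, not 021) so the shell
# does not interpret them as octal.
slave_hosts() {
    prefix=$1
    i=$2
    last=$3
    while [ "$i" -le "$last" ]; do
        printf '%s-%03d\n' "$prefix" "$i"
        i=$((i + 1))
    done
}

# Example (assumed workflow, not taken from this bug):
#   for h in $(slave_hosts talos-r3-snow 21 50) \
#            $(slave_hosts talos-r3-leopard 41 50); do
#       ssh "root@$h.build.mozilla.org" reboot
#   done
```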
Severity: normal → blocker
Can you give me a machine to test this from?

A source Linux machine, its IP address, location, what you are trying to mount, etc.
Assignee: server-ops → aravind
It looks like you are trying to mount an NFS store in mpt from a machine in the Castro office. Has this ever worked in the past? If not, this will probably need all kinds of ACLs, firewall redirects, etc. Copying Derek for his input.

Feel free to correct me if my assumptions above are wrong.
10.2.71.136:/export/buildlogs is exported read-only to the world

I remember an issue with this in the past, and I believe it was related to trying to mount the partition read-write?

I can't log in to any of the new machines to test, but this likely isn't a network or NFS export permission problem.
(fixing dependency)

(In reply to comment #4)
> 10.2.71.136:/export/buildlogs is exported read-only to the world
> 
> I remember an issue with this in the past, and I believe it was related to
> trying to mount the partition read-write?
Read-only should be good enough - they only want to read puppet manifests, aiui.

These new machines were all imaged from the same ref images as we used for the previous batch of minis, so I would have expected them to mount that NFS drive with the same read-only permissions as the previous minis.

> I can't log in to any of the new machines to test, but this likely isn't a
> network or NFS export permission problem.
talos-r3-snow-{021...050}.build.m.o are all set up with the same usual RelEng passwords, and all show this problem.
Blocks: 557294
Which host is this failing on? I just successfully mounted the share on talos-r3-snow-031:


bash-3.2# hostname
talos-r3-snow-031.build.mozilla.org
bash-3.2# mount 10.2.71.136:/export/buildlogs/puppet-files /N
bash-3.2# ls /N
CVS			darwin9			shared
centos5			debuginfo-packages	talos
darwin-shared		dist
darwin10		mercurial
I can't seem to reach 21-30, maybe that's related?
(In reply to comment #7)
> I can't seem to reach 21-30, maybe that's related?

Neither can nagios, I'd guess they haven't been imaged or racked yet.

(In reply to earlier comments)

When I was working with talos-r3-snow-{031...050} and talos-r3-leopard-{041...050}, I noticed:
* 'sudo mount /N' was required in an ssh session
* /N was not reliably in place after boot; it appeared to turn up some minutes after boot, and possibly dropped again. I need to look again to sort out exactly what's happening on one box, rather than flit around several
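If /N really does show up only some minutes after boot, one workaround is to retry the mount a few times before giving up. This is a rough sketch, not something that was deployed for this bug: the `retry` helper is hypothetical, and the attempt count, delay, and the `" on /N "` pattern matched against `mount` output are all assumptions.

```shell
# Hypothetical helper: run a command until it succeeds or attempts run out.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
        # Sleep only if another attempt is still to come.
        if [ "$n" -le "$attempts" ]; then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (assumed values): try for about five minutes after boot.
#   retry 10 30 sh -c 'mount | grep -q " on /N " || mount /N'
```

Retrying only masks the symptom; it would still be worth finding out why the mount is late or drops on a single box.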
It looks like this problem is not there anymore and the slaves are able to mount the /N directory now.

Slave 021 was able to sync up with puppet properly
Looks like everything is well with the world here? Please re-open if it's still on fire.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
It was hard to confirm, with all the other network woes going on at the same time. However, this looks like it is now all resolved.
Product: mozilla.org → mozilla.org Graveyard