Closed Bug 656450 Opened 13 years ago Closed 13 years ago

gethostbyaddr for your own address doesn't work on w7

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

Attachments

(1 file)

Leave it to Microsoft to screw up every last thing.

>>> import socket
>>> socket.gethostname()
'talos-r3-w7-003'
>>> socket.gethostbyaddr('10.12.50.164')
('talos-r3-w7-003.build.mozilla.org', [], ['10.12.50.164'])

when in fact, that record should be build.scl1. It works fine for that IP from another host:

>>> import socket
>>> socket.gethostname()
'talos-r3-w7-002'
>>> socket.gethostbyaddr('10.12.50.164')
('talos-r3-w7-003.build.scl1.mozilla.com', [], ['10.12.50.164'])

but of course, that host will get its own address wrong:

>>> socket.gethostbyaddr('10.12.50.163')
('talos-r3-w7-002.build.mozilla.org', [], ['10.12.50.163'])

This blocks runslave.py's ability to figure out the proper hostname to use to report to NSCA.
Blocks: 629692
Same problem on XP:

>>> socket.gethostname()
'talos-r3-xp-003'
>>> socket.gethostbyname('talos-r3-xp-003')
'10.12.50.111'
>>> socket.gethostbyaddr('10.12.50.111')
('talos-r3-xp-003.build.mozilla.org', [], ['10.12.50.111'])
Blocks: 656441
Blocks: 656175
So I changed the "Primary DNS Suffix" of talos-r3-w7-003 to "build.scl1.mozilla.com", and it came up with the right name. So I think we need to set these suffixes correctly everywhere. I'm not sure what mayhem that will cause.
I suppose another option is to embed a basic DNS resolver in runslave.py.
Assignee: nobody → dustin
catlee points out that changing the primary DNS suffix will likely make OPSI fail. I'm not sure how hard it would be to get OPSI on its feet again after that, but nobody here knows much about OPSI, so I expect that would be "hard".
http://www.dnspython.org/ looks decent, but I don't have a good way to install an entire Python package on Windows - I need something that will fit comfortably in runslave.py itself.
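For a sense of scale, here is a rough sketch of the query half of a resolver small enough to embed in a single script. It only builds the in-addr.arpa name and the raw PTR query packet in the RFC 1035 wire format. Sending it over UDP and decoding the (compressed) response is left out, and that is where most of the work would be. The function names reverse_name and build_ptr_query and the default query ID are made up for illustration; nothing like this is in runslave.py.

```python
import struct


def reverse_name(ip):
    """Return the in-addr.arpa name used for a PTR lookup of an IPv4 address."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address: %r" % (ip,))
    return ".".join(reversed(octets)) + ".in-addr.arpa"


def build_ptr_query(ip, query_id=0x1234):
    """Build the raw bytes of a DNS PTR query for ip (hypothetical helper)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, then three zero counts.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b""
    for label in reverse_name(ip).split("."):
        encoded = label.encode("ascii")
        # Each label is prefixed with its length as a single byte.
        qname += struct.pack("!B", len(encoded)) + encoded
    qname += b"\x00"  # the empty root label terminates the name
    # QTYPE=12 (PTR), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 12, 1)
```

The packet would then go to a nameserver on UDP port 53, which means the script would also need to know which nameserver to ask.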
Why not just use nslookup to find that info and parse its output? That way we avoid having to hard-code too much and can also leave the net config OPSI-friendly.
That's a good idea. Another decent idea is to use the IP address, since that's much more reliably fetched than the fqdn.
The IP address won't be a reliable indicator when we do layer 2 connectivity, though (repoman1, for example, is in scl2 but is in the same address space as sjc1).
(In reply to comment #4)
> catlee points out that changing the primary DNS suffix will likely make OPSI
> fail. I'm not sure how hard it would be to get OPSI on its feet again after
> that, but nobody here knows much about OPSI, so I expect that would be
> "hard".

I suspect the failure mode here will be mismatched keys when the host uses a new FQDN to connect to OPSI. That's a recoverable state, probably by reinstalling preloginloader per slave, but not ideal/trivial.
This may be fixable without reinstalling the preloginloader by changing the following:

* /etc/opsi/pckeys
* /var/lib/opsi/config/clients filenames
* Possibly changing something in the "locked.cfg" file on the slaves (located somewhere in program files/opsi.org)

I make no promises or guarantees, however!
I'm not keen on tangling with OPSI, so I think bear's suggestion is looking best right now. Basically, shell out the DNS lookup to nslookup.exe, which actually does a DNS query instead of asking the Windows resolver.
Bear, since it's your idea, it's yours to review. This tries the nslookup trick on Windows, falling back to the results of gethostbyaddr if that doesn't work. It's an ugly hack, but hopefully sufficiently well-isolated to not be that ugly. I tested this on XP and W764, and it worked on both.
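For readers without the attachment handy, the approach described above could look roughly like the sketch below. This is not the patch itself. The function names parse_nslookup_name and reverse_lookup are hypothetical, and the parsing assumes nslookup prints the answer on a line beginning with "Name:", which is what the Windows tool normally does for a successful lookup.

```python
import socket
import subprocess
import sys


def parse_nslookup_name(output):
    """Return the hostname from the 'Name:' line of nslookup output, or None."""
    for line in output.splitlines():
        line = line.strip()
        if line.lower().startswith("name:"):
            name = line.split(":", 1)[1].strip()
            if name:
                return name
    return None


def reverse_lookup(ip):
    """Resolve ip to a hostname, preferring nslookup on Windows (a sketch)."""
    if sys.platform == "win32":
        try:
            output = subprocess.check_output(
                ["nslookup", ip],
                stderr=subprocess.STDOUT,
                universal_newlines=True,
            )
            name = parse_nslookup_name(output)
            if name:
                return name
        except (OSError, subprocess.CalledProcessError):
            pass  # nslookup missing or failed: fall back to the system resolver
    # Fallback (and the only path on non-Windows hosts).
    return socket.gethostbyaddr(ip)[0]
```

Keeping the parsing in its own function means it can be checked against captured nslookup output without touching the network.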
Attachment #532035 - Flags: review?(bear)
Comment on attachment 532035
m656450-puppet-manifests-r1.patch

looks good
Attachment #532035 - Flags: review?(bear) → review+
Comment on attachment 532035
m656450-puppet-manifests-r1.patch

changeset:   336:888ff8693f52
tag:         tip
user:        Dustin J. Mitchell <dustin@mozilla.com>
date:        Thu May 12 17:15:12 2011 -0500
summary:     Bug 656450: workaround broken gethostbyaddr on windows; r=bear

Deployed everywhere puppet can reach.
Attachment #532035 - Flags: checked-in+
Seems functional. Now to redeploy!
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering