Symptoms
At work we have OS X 10.7.3 installed, and every once in a while I see the following behaviors:
If the screen is locked, repeated attempts with the same username and password are not accepted.
If the screen is unlocked, opening a new bash terminal may yield prompts such as:
`I have no name$`
or
lkyrala$ ssh lkyrala@ah-lkyrala2u
You don't exist, go away!
Even when our Macs are working normally, everyone here has to log in twice. The first time after boot always fails, but the second time (with the same password, not changing anything, just pressing enter again) succeeds. Weird?
Workarounds
There are some workarounds that resolve the immediate problem, but don't prevent it from happening again:
Wait (maybe an hour or two); the problems sometimes go away by themselves.
Kill 'opendirectoryd' and let it restart (from Apple Support Communities: "User ID (not data) deleted suddenly?").
Hold the power button to reset the computer.
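The second workaround can be sketched as a small shell function. This is only a sketch: the function name is made up for illustration, it assumes the account has sudo rights, and it assumes the daemon comes back on its own within a couple of seconds (as the workaround describes).

```shell
# Workaround sketch: kill opendirectoryd and check that it came back.
# Assumptions: sudo rights on the affected Mac; the daemon restarts by itself.
restart_opendirectoryd() {
    sudo killall opendirectoryd || return 1
    sleep 2   # give the system a moment to restart the daemon
    # Print the PID and command of the restarted daemon;
    # grep exits non-zero if it is not running.
    ps -axo pid,comm | grep opendirectoryd
}
```

Running `restart_opendirectoryd` from a terminal that still works should print one line with the new PID. This does not fix the underlying cause; it only clears the immediate symptom.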
UPDATE 10/4/2012
Our net admins suspect that lockd is implicated. lockd apparently uses UDP and when the network is congested, packets are lost, which results in the hanging behavior. They are looking at steps to decrease the congestion. If the file access in question happens to be the Active Directory authentication handle, then all of these different pieces start to fit together.
Discussion
Now, the evidence above points me to something screwy with opendirectory and login credentials. Some other people report having these login problems, but it's hard to determine where the actual problem is (Mac, or network environment?).
I should add that most machines on the network run Windows, but we have quite a few Macs and Linux machines as well. I'm not sure of the details of how network auth is mapped between the various domains. All I know is that our network credentials work for Windows domain logins as well as Mac and Linux logins, so something is either connecting separate systems or using the same global auth system.
Additional Detail
Unfortunately, our IT department set up this Mac, not me, so I'm not entirely sure how authentication works. I do know that it is a network login, which is unusual in my experience with Macs (they usually have local accounts that connect to external resources); here, our home folder is on the network, not local. Under my Linux installs, connecting to the network involves yp/NIS (which allows us to automount parts of our network filesystem from any machine), and the opendirectoryd.log seems to confirm that NIS is involved here too.
/var/log/opendirectoryd.log*
shows:
2012-04-04 01:29:12.370 EDT - ddddd.dddddd.dddddd.dddddd - Client: automount, UID: 0, EUID: 0, GID: 0, EGID: 0
2012-04-04 01:29:12.370 EDT - ddddd.dddddd.dddddd.dddddd, Node: /NIS/Domain, Module: nis - could not determine map for rectype 'mounts' attribute 'byname'
2012-04-04 01:32:04.504 EDT - failed to get YP map list
It looks like the domain 'Domain' is being lost somehow. Why is the UID == 0 here? That seems bad, doesn't it?
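To check whether these NIS errors line up with the days the logins fail, one option is to count them per day across the log files. A rough sketch: the function name is made up, and it assumes the logs are plain text with the date as the first field, as in the lines quoted above.

```shell
# Count NIS-related errors per date in one or more opendirectoryd logs.
# Assumption: each line starts with a YYYY-MM-DD date, as in the excerpt above.
# Usage: nis_errors_by_day /var/log/opendirectoryd.log*
nis_errors_by_day() {
    grep -h -E 'Module: nis|failed to get YP map list' "$@" |
        cut -d' ' -f1 |   # keep only the date field
        sort |
        uniq -c           # prints "<count> <date>" per day
}
```

Days with a high count can then be compared against the days people reported being locked out.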
A while back under Linux, I discovered that the NIS broadcast had been disabled or blocked, so I gathered the IPs from someone and set the ypserver IPs manually in /etc/yp.conf, and that fixed the drops in Linux. Maybe something similar is going on here?
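For reference, the Linux fix amounts to one `ypserver` line per server in /etc/yp.conf. A small sketch that prints such lines: the function name is made up, and the IPs are documentation placeholders, not our real servers.

```shell
# Print yp.conf lines that pin ypbind to specific servers instead of
# relying on broadcast. The IPs used below are placeholders only.
make_yp_conf() {
    for ip in "$@"; do
        printf 'ypserver %s\n' "$ip"
    done
}

make_yp_conf 192.0.2.10 192.0.2.11
```

On the Linux side, the printed lines went into /etc/yp.conf and ypbind was restarted afterwards. Whether the Mac needs the equivalent is exactly the open question.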
I tried looking up information in the Mac's yp man pages, and then found a post detailing where the existing servers are set. However, checking the ypserver settings showed that both server IPs were correctly set for NIS.
Checking /var/log/system.log
shows:
Aug 28 00:30:08 mymac ypbind[22991]: direct: sendto: No route to host
Aug 28 00:30:08 mymac ypbind[22991]: direct: sendto: No route to host
Aug 28 00:30:08 mymac ypbind[22991]: Can't contact any servers listed in /var/yp/binding/Domain.ypservers. Aborting
Aug 28 00:30:08 mymac com.apple.launchd[1] (com.apple.nis.ypbind[22991]): Exited with code: 1
Aug 28 00:30:08 mymac com.apple.launchd[1] (com.apple.nis.ypbind): Throttling respawn: Will start in 10 seconds
Aug 28 00:30:08 mymac xpchelper[22990]: getpwuid_r() failed for UID: uuuu, ret: 0, errno: 0
So this makes me suspect the nfs.conf settings, etc. Some others believe that this is due to something in lockd.
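Since ypbind complains about "No route to host", a quick check is to ping every server in the file it names the next time the problem shows up. This is a sketch under assumptions: the function name is made up, the file is assumed to hold one host or IP per line, and a successful ping only shows that the host is reachable, not that NIS on it is answering.

```shell
# Ping each server listed in a ypservers file and report reachability.
# Assumptions: one host or IP per line; default path taken from the
# ypbind message in system.log.
check_ypservers() {
    file=${1:-/var/yp/binding/Domain.ypservers}
    while read -r server; do
        [ -z "$server" ] && continue   # skip blank lines
        if ping -c 1 "$server" >/dev/null 2>&1 </dev/null; then
            echo "$server reachable"
        else
            echo "$server UNREACHABLE"
        fi
    done < "$file"
}
```

If both servers show as unreachable while the logins are failing, that points at the network (or routing on the Mac) rather than at opendirectoryd itself.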