How To Fix Host Not Responding Error with VMware ESX, vSphere in vCenter
Posted on 05 Feb 2010 by Ray Heffer
VirtualCenter loses connectivity to an ESX or vSphere host, and all of the virtual machines running on the host show as ‘disconnected’. You will also see ‘not responding’ in brackets next to the host’s name. This one is usually simple to fix, as it is typically caused by the host agent service (mgmt-vmware) failing due to a dead process.
First, try to restart the mgmt-vmware service:
# service mgmt-vmware restart
If you find this is hanging when trying to restart the host agent, then you’ll need to kill off the process causing the issue. Open another console session and do the following:
# ps -ef | grep hostd
This will output a list of processes using hostd similar to the following:
root 23955 1 0 10:42 pts/1 00:00:00 /bin/sh /usr/bin/vmware-watchdog -s hostd -u 60 -q 5 -c /usr/sbin/vmware-hostd-support /usr/sbin/vmware-hostd -u
root 23961 23955 4 10:42 ? 00:00:15 /usr/lib/vmware/hostd/vmware-hostd /etc/vmware/hostd/config.xml -u
root 24211 23422 0 10:48 pts/1 00:00:00 grep hostd
If you look at the output carefully you’ll see that the first process is the vmware-watchdog wrapper, which is fine, but the second line is the hostd process itself (config.xml -u). This is the culprit, so let’s kill it using its process ID (the second column, 23961 in this example; yours will differ). By the way, your virtual machines will continue to run, so don’t worry about that.
# kill -9 23961
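If you would rather not pick the PID out by eye, the lookup can be sketched as a small shell function. This is only a rough helper, not anything VMware ships: the name hostd_pid is made up for illustration, and it assumes the standard `ps -ef` column layout shown above, where the second field is the PID and the eighth field is the first word of the command line.

```shell
# Print the PID of the vmware-hostd binary from `ps -ef` output on stdin.
# Assumption: field 2 is the PID and field 8 is the first word of the command,
# so the watchdog line (/bin/sh ...) and the grep line are skipped.
hostd_pid() {
    awk '$8 ~ /\/vmware-hostd$/ { print $2 }'
}

# Usage on the host (check the PID it prints before killing anything):
#   pid=$(ps -ef | hostd_pid)
#   [ -n "$pid" ] && kill -9 "$pid"
```

Matching on the command field rather than grepping the whole line is what keeps the watchdog process out of the result.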
You’ll now find that the hostd service will start and after a few seconds your host and virtual machines will become available again in vCenter.
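To confirm from the console that hostd has come back, something like the following polling loop could be used. Treat it as a sketch under assumptions: wait_for_hostd and the PS_CMD override are hypothetical names introduced here, the retry count and delay are arbitrary defaults, and a running hostd process does not by itself prove that vCenter has reconnected to the host.

```shell
# Poll until a vmware-hostd process is visible in the process list, or give up.
# PS_CMD is a hypothetical override so the loop can be tried away from the host;
# by default it runs `ps -ef`.
wait_for_hostd() {
    tries=${1:-12}   # number of attempts (arbitrary default)
    delay=${2:-5}    # seconds to sleep between attempts (arbitrary default)
    i=0
    while [ "$i" -lt "$tries" ]; do
        # awk exits 0 only if some line has vmware-hostd as its command
        if ${PS_CMD:-ps -ef} | awk '$8 ~ /\/vmware-hostd$/ { found = 1 } END { exit !found }'; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage on the host:
#   wait_for_hostd && echo "hostd is running again"
```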