VMware | ShocKNetworK Blog

Disconnect cdroms from the command line in VMs.

This lists the VM IDs:
vimsh -n -e "vmsvc/getallvms" | grep -i vmid | awk '{print $3}' > /tmp/vmids
This disconnects the device:
vimsh -n -e "vmsvc/deviceconnection 3000 disconnect"
3000 here is usually ide0:0. For other devices, you’ll need to look it up in:
vimsh -n -e "vmsvc/showdevices "
Then just put it into a for loop to disconnect all cdroms.
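The for loop mentioned above could look roughly like the sketch below. It is only an illustration: the function name `disconnect_all_cdroms` is made up, and the argument order passed to `vmsvc/deviceconnection` (VM ID, then device key, then state) is an assumption that is not confirmed by the post, so check it on your host before relying on it. The digit filter is there in case the awk output carries punctuation.

```shell
#!/bin/sh
# Sketch only: disconnect one device key on every VM ID listed in a file.
# VIMSH and DEVICE_KEY can be overridden from the environment.
VIMSH="${VIMSH:-vimsh}"
DEVICE_KEY="${DEVICE_KEY:-3000}"   # 3000 is usually ide0:0, per the post

disconnect_all_cdroms() {
    idfile="$1"
    while read -r line; do
        # Keep only the digits, in case the line carries punctuation.
        vmid=$(printf '%s' "$line" | tr -cd '0-9')
        [ -n "$vmid" ] || continue
        # ASSUMPTION: deviceconnection takes <vmid> <device key> <state>.
        # stdin is redirected so the command cannot eat the ID list.
        "$VIMSH" -n -e "vmsvc/deviceconnection $vmid $DEVICE_KEY disconnect" < /dev/null
    done < "$idfile"
}
```

With the ID list from above in place, the call would be `disconnect_all_cdroms /tmp/vmids`.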

"not enough licenses" from VC is deceiving

1. error: "not enough licenses"
2. The vpxa log file seemed to point to vmID 192 having problems accessing a datastore.
[2007-06-21 16:19:25.485 'App' 84306864 warning] ============BEGIN FAILED METHOD CALL DUMP============
[2007-06-21 16:19:25.485 'App' 84306864 warning] Invoking [GetConfig] on [vim.VirtualMachine:192]
[2007-06-21 16:19:25.485 'App' 84306864 warning] Fault has an empty message
[2007-06-21 16:19:25.485 'App' 84306864 warning] ============END FAILED METHOD CALL DUMP============
[2007-06-21 16:19:25.485 'App' 84306864 error] [vm.GetConfig] Received exception in GetConfig: vmodl.fault.SystemError
[2007-06-21 16:19:28.803 'App' 84306864 error] [VpxaVmprovUtil] Unable to lookup datastore EVA4000_VI3_VMFS6
[2007-06-21 16:19:28.803 'App' 84306864 error] [vm.GetConfig] Received exception in GetConfigSnapshot: vim.fault.InvalidDatastorePath
[2007-06-21 16:19:28.950 'App' 84306864 error] GetResult failed: not well-formed (invalid token)
[2007-06-21 16:19:28.950 'App' 84306864 warning]
3. Looked at the vmInventory.xml file to find which VM that vmID belonged to.
4. Looked at the vmx file of this VM. One of the parameters, SMBIOS.reflecthost, had garbled information. Changed it to SMBIOS.reflecthost=TRUE.
5. In order for the VM to take this value, we ran: vmware-cmd Vmname setconfig SMBIOS.reflecthost=TRUE
6. Now we could add the host back into the VC server.
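A quick way to spot a garbled SMBIOS.reflecthost value before opening the file in an editor is sketched below. The helper names `vmx_get` and `reflecthost_ok` are hypothetical, and the sketch assumes the usual `key = "value"` layout of .vmx files.

```shell
#!/bin/sh
# Sketch: print the value of one key from a .vmx file.
# Assumes lines of the form: key = "value"
vmx_get() {
    key="$1"; file="$2"
    awk -F= -v k="$key" '
        { name = $1; gsub(/[ \t]/, "", name) }
        tolower(name) == tolower(k) {
            val = $2
            sub(/^[ \t]*"?/, "", val)      # strip leading blanks and quote
            sub(/"?[ \t\r]*$/, "", val)    # strip trailing quote and blanks
            print val
            exit
        }' "$file"
}

# Succeeds only if SMBIOS.reflecthost holds a clean TRUE or FALSE.
reflecthost_ok() {
    case "$(vmx_get SMBIOS.reflecthost "$1")" in
        TRUE|FALSE|true|false) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `reflecthost_ok /path/to/vm.vmx || echo "check SMBIOS.reflecthost"` flags a file worth a closer look; the path is a placeholder.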

/etc/hosts and DNS is very damn important for VMware HA!

I really wish that AAM had better error messages. The error in the add_node_config (or something like that) log showed that it couldn’t bind to the port. The issue was that it couldn’t bind to the port on a different host! The reason was that DNS or the /etc/hosts file had a different IP from the one on vswif0. Make sure that your hostname resolves to your vswif0 IP!
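The /etc/hosts half of that check can be scripted. The sketch below only reads a hosts-format file and does not query DNS; the function names and the sample IP in the usage note are made up for illustration.

```shell
#!/bin/sh
# Sketch: return the first IP that a hosts-format file maps to a name.
hosts_ip_for() {
    name="$1"; file="${2:-/etc/hosts}"
    awk -v n="$name" '
        { sub(/#.*/, "") }     # drop comments; fields are re-split
        { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
    ' "$file"
}

# Succeeds when the name maps to the expected IP (e.g. the vswif0 address).
hostname_matches_ip() {
    [ "$(hosts_ip_for "$1" "${3:-/etc/hosts}")" = "$2" ]
}
```

On a host you would read the vswif0 address from `ifconfig vswif0` and then run something like `hostname_matches_ip "$(hostname)" 10.0.0.5 || echo "hostname does not resolve to vswif0"`.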

How to configure HA failover custom settings on VI3

FT_DIR=/opt/LGTOaam512/
FT_DOMAIN=vmware
export FT_DIR FT_DOMAIN
cd /opt/LGTOaam512/bin
./ftcli
Now you should see the AAM prompt AAM>
AAM> getfailuredetectioninfo
You will see something like this:
Heartbeat Interval: 1000
Heartbeat Timeout: 15000
Heartbeat Port: 8044
Multicast Address: 224.0.6.127
You can change the timeout, for example to 10 seconds, with:
AAM> setfailuredetectiontime 10000
You need to do this on all the members of the HA cluster.
With the default settings, failover should kick in when one member of the HA cluster loses its network or is powered down. Hope this helps
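Since the value has to match on every member, it can help to pull the timeout out of the getfailuredetectioninfo output you saved from each host and compare the numbers. This is a minimal sketch that assumes the output looks like the sample above; `heartbeat_timeout_ms` is a hypothetical name, and how you capture the ftcli output is left to you.

```shell
#!/bin/sh
# Sketch: read getfailuredetectioninfo output on stdin and print the
# heartbeat timeout in milliseconds.
heartbeat_timeout_ms() {
    awk -F': *' '/^Heartbeat Timeout:/ { print $2 + 0; exit }'
}
```

For example, `heartbeat_timeout_ms < esx01-aam.txt` prints the timeout for the host whose output was saved in that (placeholder) file.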

ESX 3.0.x – vmware-hostd is not cool …

If you’re using autostart for your VMs, you’ll have to be very careful, because restarting hostd will SHUT DOWN your VMs!

The way autostart works in 3.0.x is that it automatically starts the VMs with hostd and shuts them down with hostd, so you don’t want to be restarting mgmt-vmware if you’re using autostart for your VMs.

fails to deploy templates …

interesting iSCSI – started w/ snapshot luns / resignature

"Error: Invalid vmhba name at position 1"
uhhh … okay … And when you try logging into VC, vpxa crashes and you get:
Failed to serialize result of method vmodl.query.PropertyCollector.waitForUpdates:
You get "Failed to serialize result" when logging into the host directly via the VIC as well, but it doesn’t crash vmware-hostd. So now what??? Well, we checked the SAN and it showed that the LUNs were presented properly. Then, we found that running:
killall -HUP vmkiscsid
and then running:
esxcfg-rescan vmhba40
got us going again. Of course, we got the snapshot LUN problem again, so we just set DisallowSnapshotLun to 0 and EnableResignature to 1, rescanned so the LUNs got resignatured, and changed the values back immediately after.
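The whole sequence can be written down as a small script so it is repeatable. This is a sketch under assumptions: it uses `esxcfg-advcfg` with the options under `/LVM/` (the post does not say how the values were set), it assumes the values to restore are 1 and 0, and the function names are made up. Run it with DRY_RUN=1 first to see what it would do.

```shell
#!/bin/sh
# Sketch of the recovery sequence described above. Set DRY_RUN=1 to print
# the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

iscsi_rescan_and_resignature() {
    hba="$1"                                  # e.g. vmhba40
    run killall -HUP vmkiscsid
    run esxcfg-rescan "$hba"
    # ASSUMPTION: both options are set through esxcfg-advcfg under /LVM/.
    run esxcfg-advcfg -s 0 /LVM/DisallowSnapshotLun
    run esxcfg-advcfg -s 1 /LVM/EnableResignature
    run esxcfg-rescan "$hba"
    # ASSUMPTION: the previous values were 1 and 0; adjust if yours differ.
    run esxcfg-advcfg -s 1 /LVM/DisallowSnapshotLun
    run esxcfg-advcfg -s 0 /LVM/EnableResignature
}
```

A dry run would be `DRY_RUN=1 iscsi_rescan_and_resignature vmhba40`.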

How to Troubleshoot ESX 2.5.x by loading vmkernel manually

# chkconfig vmware off
This will let you boot into ESX without starting the VMkernel. Reboot the server and allow it to boot into the standard "ESX" mode. You will notice on the next reboot that although ESX was selected, the typical VMware services are skipped. This provides you with a clean slate to manually step through the process of loading the VMkernel to narrow down the root cause of your boot issues.
1. Load the vmnix module:
# /sbin/insmod -s -f vmnixmod
You will get a message about tainted drivers, which can be ignored.
2. Load the VMkernel itself:
# /usr/sbin/vmkloader /usr/lib/vmware/vmkernel
3. Allow the VMkernel to run Linux drivers:
# /usr/sbin/vmkload_mod -e /usr/lib/vmware/vmkmod/vmklinux linux
As we understand it, this is the step in which the final transformations occur to load the management console as a virtual machine.
4. Make sure all devices are enumerated:
# /usr/sbin/vmkchdev -n
The next steps are system specific, based on the hardware installed in the system. This is typically where we see the majority of issues while loading the VMkernel. If the system freezes while loading a specific module, you have narrowed down your issue to a very specific portion of the boot process, and further investigation may be performed with VMware support or other methods. To review which modules need to be loaded, check the contents of your vmkmodule.conf file:
# cat /etc/vmware/vmkmodule.conf
We will use one of our servers as an example configuration.
vmklinux linux
nfshaper.o nfshaper
bcm5700.o vmnic
e1000.o vmnic
aic79xx.o aic79xx
We are now going to load the drivers one by one using vmkload_mod. Since the vmklinux module was already loaded in step 3 above, it is not necessary here. If a module is commented out, it is not required in this step.
Load the packet shaper driver (this is disabled by default):
# /usr/sbin/vmkload_mod /usr/lib/vmware/vmkmod/nfshaper.o shaper
Load an Intel e1000 network adapter:
# vmkload_mod /usr/lib/vmware/vmkmod/e1000.o vmnic
Load a Broadcom BCM5700 network adapter:
# vmkload_mod /usr/lib/vmware/vmkmod/bcm5700.o vmnic
Load a SCSI adapter:
# vmkload_mod /usr/lib/vmware/vmkmod/aic79xx.o aic79xx
If any one module hangs the system, you have found your culprit. A complete list of the steps followed should be documented in case a support call needs to be opened with VMware. The above steps will help narrow problems to a specific area. If the system starts as expected without error in the above process, VMware support should be consulted to help further analyze why a particular system may hang during its boot process.
When all is said and done, do not forget to re-enable the VMkernel services on startup with the following command:
# chkconfig vmware on
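To avoid retyping module paths, the per-module commands can be generated from vmkmodule.conf and then run by hand one at a time. This is a rough sketch: `module_load_commands` is a hypothetical helper, and it simply copies each active line's arguments from the file, so compare the output with the commands shown above before running anything.

```shell
#!/bin/sh
# Sketch: print one vmkload_mod command per active line of vmkmodule.conf.
MODDIR="${MODDIR:-/usr/lib/vmware/vmkmod}"

module_load_commands() {
    conf="${1:-/etc/vmware/vmkmodule.conf}"
    awk -v dir="$MODDIR" '
        /^[ \t]*#/       { next }   # commented out: not required
        NF < 2           { next }   # blank or incomplete line
        $1 == "vmklinux" { next }   # already loaded in step 3
        {
            cmd = "/usr/sbin/vmkload_mod " dir "/" $1
            for (i = 2; i <= NF; i++) cmd = cmd " " $i
            print cmd
        }
    ' "$conf"
}
```

Printing the commands rather than executing them keeps the one-by-one approach: paste each line, and if the system hangs you know which module did it.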

couple of things to look at for AD auth in ESX 3

http://www.vmware.com/pdf/esx_authentication_AD.pdf
You should additionally check for:
1) Firewall
/usr/sbin/esxcfg-firewall --allowoutgoing --openport 389,tcp,out,in,LDAP
We need to allow incoming and outgoing for port 389.
2) Time.
It’s probably best to sync time with the AD server using NTP. Just configure the /etc/ntp.conf and /etc/ntp/step-tickers files with the AD server.
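The two NTP files can be filled in with a few lines of shell. This is a minimal sketch: `write_ntp_config` and the server name `ad.example.com` are placeholders, and the second argument exists only so the function can be tried against a scratch directory instead of /etc.

```shell
#!/bin/sh
# Sketch: point ntp.conf and step-tickers at one time server.
write_ntp_config() {
    server="$1"; etc="${2:-/etc}"
    mkdir -p "$etc/ntp"
    # ntp.conf: add the server line only if it is not already there.
    grep -qxF "server $server" "$etc/ntp.conf" 2>/dev/null ||
        printf 'server %s\n' "$server" >> "$etc/ntp.conf"
    # step-tickers: one server per line, used for the initial clock step.
    printf '%s\n' "$server" > "$etc/ntp/step-tickers"
}
```

After something like `write_ntp_config ad.example.com`, the NTP daemon needs a restart to pick up the change.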

use cat /dev/null instead of rm

1) I check disk space.
[root@supp01 Adon_RHEL_4]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda5            1011M  464M  496M  49% /
/dev/sda1              99M   12M   82M  13% /boot
none                  133M     0  132M   0% /dev/shm
/dev/sda6             494M  8.1M  460M   2% /tmp
/dev/sda2             2.0G  541M  1.3G  29% /usr
/dev/sda3             2.0G  614M  1.2G  33% /var
/dev/sda8              61G  960M   57G   2% /vmimages
2) I see that the VM is running and there are processes that have the file open.
[root@supp01 Adon_RHEL_4]# fuser vmware.log
vmware.log: 571 572 573 19874 19875 19882
3) I fill up the file.
[root@supp01 Adon_RHEL_4]# cat /dev/zero >> vmware.log
cat: write error: No space left on device
4) The filesystem is full.
[root@supp01 Adon_RHEL_4]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda5            1011M 1012M     0 100% /
/dev/sda1              99M   12M   82M  13% /boot
none                  133M     0  132M   0% /dev/shm
/dev/sda6             494M  8.1M  460M   2% /tmp
/dev/sda2             2.0G  541M  1.3G  29% /usr
/dev/sda3             2.0G  614M  1.2G  33% /var
/dev/sda8              61G  960M   57G   2% /vmimages
5) I wipe the file out.
[root@supp01 Adon_RHEL_4]# cat /dev/null > vmware.log
6) I no longer have a full filesystem.
[root@supp01 Adon_RHEL_4]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda5            1011M  464M  496M  49% /
/dev/sda1              99M   12M   82M  13% /boot
none                  133M     0  132M   0% /dev/shm
/dev/sda6             494M  8.1M  460M   2% /tmp
/dev/sda2             2.0G  541M  1.3G  29% /usr
/dev/sda3             2.0G  614M  1.2G  33% /var
/dev/sda8              61G  960M   57G   2% /vmimages
7) Processes still have the file open.
[root@supp01 Adon_RHEL_4]# fuser vmware.log
vmware.log: 571 572 573 19874 19875 19882
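The reason this works where rm would not: rm only removes the name, and the blocks stay allocated until the last process closes the file, while truncating frees them right away and leaves the open handles valid. The sketch below reproduces that on a scratch file; `truncate_demo` is a made-up name, and file descriptor 3 opened in append mode stands in for a process that is logging to the file.

```shell
#!/bin/sh
# Sketch: truncate a file in place while a writer still holds it open.
truncate_demo() {
    f=$(mktemp)
    exec 3>>"$f"                    # hold the file open in append mode
    printf 'xxxx' >&3               # the writer puts 4 bytes in the file
    before=$(( $(wc -c < "$f") ))
    cat /dev/null > "$f"            # truncate in place; fd 3 stays valid
    after=$(( $(wc -c < "$f") ))
    printf 'more\n' >&3             # the writer keeps logging to the same file
    final=$(( $(wc -c < "$f") ))
    exec 3>&-
    rm -f "$f"
    echo "$before $after $final"    # sizes before, after truncate, after next write
}
```

Calling `truncate_demo` prints `4 0 5`: the file drops to zero bytes and the still-open writer carries on from there.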