VMware ESX 3.x Snapshots

Posted on 12 Oct 2009 by Ray Heffer

Snapshots are a fantastic way of providing a quick and reliable method of rolling back the state of a virtual machine, should something go astray following a patch or update. VMware VCB also uses virtual machine snapshots to quiesce the VM prior to taking the backup.

However, in larger environments where there may be tens or hundreds of VMware ESX servers, snapshots can also be a pain in the backside if there is no control over who is using them. Why? Because snapshots work by creating a delta VMDK that records the changed blocks, a process called copy-on-write (COW). Over time the delta VMDK file will grow, and depending on the level of I/O within the VM it will grow faster on some virtual machines than on others. The danger only presents itself if the datastore where the VMDK resides reaches its capacity. When this happens, virtual machines that are not thin-provisioned and have no snapshots should continue to run with no problems, but think about these situations:

  1. You have other virtual machines in the same datastore using snapshots.
  2. You have one or more virtual machines on thin-provisioned disks.
  3. You have powered-off virtual machines that need to be powered on.

In the first two scenarios, if the datastore is full then the affected virtual machines will be suspended (paused), because the snapshot delta or thin-provisioned disk can no longer grow. Virtual machines with thick-provisioned disks and no snapshots will continue to operate, as the VMDK already has its full allocation of storage space. Virtual machines that are powered off and need to be powered back on will fail to start, as they won’t have enough disk space to create the virtual swap file.

The simple rules to follow to avoid these situations are:

  1. Allow enough free disk space in each datastore for the snapshots to reside. If snapshots are only ever created during a VCB backup, and the virtual machines have little I/O then you won’t need as much space as busy environments.
  2. Monitor the use of snapshots in your environment. I’ve included a command you can run to find snapshots below.
  3. Don’t create snapshots and then forget about them. The longer they are left in place, the bigger they get!
  4. Change the permissions to restrict who can take virtual machine snapshots. The fewer the better, and try to establish some internal procedures to make IT staff aware of the potential issues.
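
For rule 1, a rough way to keep an eye on free space is to filter the datastore usage listing for anything above a threshold. This is only a sketch: it assumes the service console's `vdf -h` prints df-style columns ending in `Use%` and the mount point, and that mount points contain no spaces. The function name `over_threshold` and the 80% default are made up for illustration.

```shell
#!/bin/sh
# over_threshold: hypothetical helper, not part of ESX.
# Reads df-style output on stdin and prints "<mount point> <Use%>" for
# every filesystem at or above the given percentage (default 80).
# It only looks at the last two fields, so entries that wrap onto a
# second line because of a long device name are still picked up.
over_threshold() {
    limit="${1:-80}"
    awk -v limit="$limit" '
        NF >= 2 && $(NF-1) ~ /^[0-9]+%$/ {
            use = $(NF-1)
            sub(/%/, "", use)          # strip the percent sign
            if (use + 0 >= limit) print $NF, $(NF-1)
        }'
}

# Example (on the ESX service console, assuming vdf is available):
#   vdf -h | over_threshold 85
```

Anything this prints is a datastore worth checking for forgotten snapshots before it fills up.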

A simple, yet useful command you can run on each ESX server to find snapshots is this:

Log in to your ESX server and change the working directory to /vmfs/volumes. Then use the grep command to find vmsn (VMware snapshot state) files.

# cd /vmfs/volumes
# ls -Rh | grep "vmsn"

This can be really useful, especially when you don’t know who is creating snapshots and forgetting they are there!
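
One limitation of the `ls -Rh | grep` approach is that it prints only file names, not the folder each one lives in. As a hedged alternative, here is a minimal sketch using `find`, which prints full paths. It assumes the service console has a standard `find`, and that snapshot delta disks follow the usual `-delta.vmdk` naming; the function name `find_snapshots` is hypothetical.

```shell
#!/bin/sh
# find_snapshots: hypothetical helper, not part of ESX.
# Prints the full path of every snapshot state file (*.vmsn) and
# snapshot delta disk (*-delta.vmdk) under the given directory.
# Defaults to /vmfs/volumes when no directory is supplied.
find_snapshots() {
    root="${1:-/vmfs/volumes}"
    find "$root" -type f \( -name '*.vmsn' -o -name '*-delta.vmdk' \)
}

# Example (on the ESX service console):
#   find_snapshots
#   find_snapshots /vmfs/volumes/storage1   # storage1 is a made-up datastore name
```

Because the full path includes the VM's folder, it is much easier to work out which virtual machine, and often which colleague, a forgotten snapshot belongs to.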