VMware ESXi Snapshots Best Practice

This is probably not the first or last time I will see this, but I recently had the misfortune of discovering VMware ESXi snapshots being used as a disaster recovery method. From experience I can warn you that while this might seem like a good idea at the time, in the long term it can cost you a lot of time and money and, most importantly, leave you with unhappy customers. If you are using snapshots in your production environment you might want to read on.

 

What are snapshots in VMware?

Snapshots are a way of capturing the state of a virtual machine at a point in time. They are a fantastic way to preserve that state before you test changes, while a backup job runs, or before you perform risky changes.

So why are they bad?

Snapshots are not backups. They are not even a differential or incremental record of a hard disk; they are in fact a change log. When a virtual machine snapshot is taken, the current hard disk is frozen in time and a new snapshot file is created. From this point on your virtual machine is running on this snapshot. This means you are creating a chain of disks, so that when your machine starts it has to access the original hard drive and then the snapshot hard drive to boot the virtual disk.

  • While the virtual machine is not running on the original VMDK, it does rely on that file existing and on the snapshot chain (however many snapshots you have) being intact. If anything damages this chain, you WILL lose data.
  • Snapshots are useless if the original disk is damaged.
  • While you are running from snapshots you are reading and writing multiple virtual hard drives, which increases disk I/O, slows response times and generates a growing change log.
  • This change log records the changes you make, so if you copy the same file 5 times your snapshot can grow by as much as 5 times the size of this file, even though on a normal hard disk you would only use the space of the single file.
  • Snapshot files can grow to the same size as the original disk, consuming more disk space.
  • When committing/deleting snapshots, the server takes the change log and commits all changes back to the original disk. Depending on the size of the snapshots and the number of snapshots in the chain, this can take a long time and be risky if any commits fail.
  • Snapshots on high transaction machines can grow dramatically in size.
  • Avoid using vMotion on virtual machines with snapshots, to prevent damage and data loss.
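To make the chain idea concrete, here is a minimal Python sketch of a base disk with snapshot files layered on top. It is a toy model only, not VMware's on-disk format: the `DiskChain` class and its method names are invented for illustration, and blocks are plain dictionary entries.

```python
class DiskChain:
    """Toy model of an original disk plus a chain of snapshot files."""

    def __init__(self, base_blocks):
        self.base = dict(base_blocks)  # block number -> data on the original disk
        self.deltas = []               # snapshot files, oldest first

    def take_snapshot(self):
        # Freeze everything written so far; new writes go to a fresh snapshot file.
        self.deltas.append({})

    def write(self, block, data):
        # With a snapshot present, the original disk is never written to.
        target = self.deltas[-1] if self.deltas else self.base
        target[block] = data

    def read(self, block):
        # Walk the chain from newest to oldest, then fall back to the original disk.
        for delta in reversed(self.deltas):
            if block in delta:
                return delta[block]
        return self.base[block]

    def commit_all(self):
        # Merge every snapshot file back into the original disk, oldest first.
        for delta in self.deltas:
            self.base.update(delta)
        self.deltas = []


chain = DiskChain({0: "boot", 1: "data-v1"})
chain.take_snapshot()
chain.write(1, "data-v2")
print(chain.read(0))   # block 0 is still served from the original disk
print(chain.read(1))   # block 1 is served from the snapshot file
chain.commit_all()
print(chain.base[1])   # after the commit the original disk holds the new data
```

Notice that `read(0)` only works because the original disk is still there. In this model, losing `base` makes every block that was never rewritten unreadable, which is the same dependency that makes a snapshot useless as a backup.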

So when should they be used?

Snapshots are great to use when performing a software release, testing new configuration changes or deploying major updates. They should be committed as soon as you have confirmed the release has worked.

So what do I need to know?

Follow VMware best practice:

  • Use only 2-3 snapshots in a chain
  • Commit snapshots within 24-72 hours
  • Check that snapshots created automatically by backup applications are being committed correctly and regularly
  • Set vSphere alerts to notify you of virtual machines running on snapshots
  • Check for the existence of snapshots before modifying virtual machine hard disks or using vMotion
  • Ensure you have sufficient storage before creating or using snapshots
  • Avoid using snapshots on high transaction virtual machines such as database or email servers
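The chain length and age limits above are easy to check with a script. The following is a minimal Python sketch that assumes you already have a list of snapshot creation times per virtual machine, for example from an inventory export. The `VmSnapshotInfo` record, the `find_violations` function and the sample machine names are all hypothetical; nothing here calls a real vSphere API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_CHAIN_LENGTH = 3                     # upper end of "2-3 snapshots in a chain"
MAX_SNAPSHOT_AGE = timedelta(hours=72)   # upper end of "24-72 hours"


@dataclass
class VmSnapshotInfo:
    """Hypothetical record; fill it from your own inventory data."""
    vm_name: str
    snapshot_created: list  # one creation datetime per snapshot in the chain


def find_violations(vms, now):
    """Return a warning string for each guideline a virtual machine breaks."""
    warnings = []
    for vm in vms:
        if len(vm.snapshot_created) > MAX_CHAIN_LENGTH:
            warnings.append(f"{vm.vm_name}: {len(vm.snapshot_created)} snapshots in chain")
        # One warning per machine is enough, however many snapshots are too old.
        if any(now - created > MAX_SNAPSHOT_AGE for created in vm.snapshot_created):
            warnings.append(f"{vm.vm_name}: snapshot older than 72 hours")
    return warnings


# Sample data, invented for illustration.
now = datetime(2024, 1, 10, 12, 0)
inventory = [
    VmSnapshotInfo("web01", [now - timedelta(hours=5)]),
    VmSnapshotInfo("db01", [now - timedelta(days=9)]),
    VmSnapshotInfo("app01", [now - timedelta(hours=h) for h in (1, 2, 3, 4)]),
]
for warning in find_violations(inventory, now):
    print(warning)
```

With this sample data the script flags db01 for an old snapshot and app01 for a chain of four, while web01 passes. The thresholds are constants at the top so you can tighten them to suit your own environment.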

Need more information? VMware KB 1025279

 

Follow these simple steps to ensure optimal performance for your virtual machines and servers, and to reduce the risk of losing your valuable data.