02 Jan 2013 by rayheffer
For my first post of 2013, I have decided to dive straight into sizing for VMware View 5.1. If you are planning a VMware View implementation then at some stage you will need to look at sizing, and calculating factors like how many desktops per View desktop pool, in addition to network configuration and storage considerations. The purpose of this article is to discuss sizing and configuration maximums for VMware View 5.1. Since VMware ESX 3.x, a configuration maximums document has been published by VMware for each version of vSphere that details the supported maximums for networking, compute, storage, vCenter, host, and even vCloud Director. Because there is no single ‘configuration maximums’ document for VMware View 5.1, I have included reference documents and material at the bottom of this article.
How many linked clones per replica can I have? How large (or small) should the network subnet be? In this post I will answer these questions and give you guidance for sizing your View deployment correctly. Let me start by first highlighting the View 5.1 configuration maximums.
View 5.1 Configuration Maximums
Note: To avoid confusion, there is a new VAAI feature in View 5.1 (currently tech preview): View Composer Array Integration with NFS snapshot technology. This uses NAS snapshots for View linked clones, whereas previous implementations of VAAI with View 5 didn’t actually integrate with View Composer at all. With this new feature the maximum number of NAS snapshots will be vendor dependent, but theoretically thousands of linked clones per datastore should be possible. vSphere can already offload operations such as thin provisioning, Atomic Test & Set (ATS), clone blocks and full copy to VAAI. I have always remained cautious about the value of VAAI in a View environment using linked clones, so I tend not to exceed the stated 128 linked clones per datastore until the technology matures or has been benchmarked with a higher number.
Throughout this post I will be making the assumption that you will be using View Composer linked clones, as full clones are not subject to many of the configuration maximums, and calculating for full clones is far simpler. The examples are based on a deployment of 2,000 Windows 7 desktops. This is NOT a sizing calculator, rather a sizing guide. Let us start by breaking down the five key sizing areas:
The first and most obvious area is the master desktop image. You need to define the sizing for your virtual desktop as this has a bearing on the other areas such as storage and host sizing. In the sizing calculations section below I have used typical values for an average office user, but is there really an ‘average’ office user these days? This will obviously change depending on the environment, and this is where I strongly recommend that you define accurate virtual desktop sizing by carrying out an assessment of the existing environment. The desktop should be as light as possible as this will simplify management of the master image, so consider delivering applications with VMware ThinApp so applications can be patched or updated separately from the master image.
Depending on the results of the desktop assessment, a Windows 7 desktop will typically require 1.5GB – 2GB RAM, and an office worker will consume around 300MHz of CPU on average. This should only be taken as a guide, which stresses the importance of an assessment to identify application demands on CPU, RAM and IOPS.
Note: The virtual swap file for each virtual machine is equal to the amount of unreserved RAM (e.g. 2GB), so you can reduce or eliminate it by creating a memory reservation of up to the amount of configured RAM. This may be important if you need to conserve as much disk space as possible. Remember that your hosts will still need to be sized appropriately, taking into account the event of a host failure to ensure you have enough physical host RAM available.
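To put a number on the disk space involved, the swap file relationship in the note above can be sketched in a few lines of Python. The function names are my own for illustration, and the 2,048MB desktop across 2,000 desktops is simply an example input, not a recommendation.

```python
def vswap_size_mb(configured_ram_mb: int, reservation_mb: int = 0) -> int:
    """Size of one VM's swap file: configured RAM minus reserved RAM."""
    if not 0 <= reservation_mb <= configured_ram_mb:
        raise ValueError("reservation must be between 0 and the configured RAM")
    return configured_ram_mb - reservation_mb


def total_vswap_gb(desktops: int, configured_ram_mb: int, reservation_mb: int = 0) -> float:
    """Datastore space consumed by swap files across all desktops, in GB."""
    return desktops * vswap_size_mb(configured_ram_mb, reservation_mb) / 1024


# Example inputs (assumed): 2,000 desktops with 2GB of configured RAM each.
print(total_vswap_gb(2000, 2048))        # no reservation
print(total_vswap_gb(2000, 2048, 2048))  # reservation equal to configured RAM
```

With no reservation the swap files alone account for roughly 4TB of datastore capacity in this example, which is why the reservation is worth considering when disk space is tight.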
The two main considerations here are DHCP scope and subnet size. If you choose a /24 (Class C) subnet, you will have up to 254 host IP addresses available for your desktops. This results in multiple View desktop pools of up to 254 virtual desktops each, as a single desktop pool can only be associated with a single VLAN / subnet. If you choose a larger subnet such as a /23, this will give you up to 510 usable IP addresses per subnet, and consequently up to 510 desktops per pool.
Whilst technically there is nothing stopping you from using a subnet of more than 1,000 IP addresses, this would result in the desktop pool size exceeding the View Composer maximum of 1,000 desktops. This may also result in large amounts of broadcast traffic, and Cisco recommend that the number of devices per VLAN should not exceed 512. I have included a link in the references section below, where Cisco state: “There is the potential for large numbers of devices within a single VLAN or broadcast domain to generate large amounts of broadcast traffic periodically, which can be problematic. A good rule of thumb is to limit the number of devices per VLAN to about 512, which is equivalent to two Class C subnets (that is, a 23-bit subnet masked Class C address).”
You also need to ensure your DHCP scope is configured appropriately for the size of the subnet. If you decide to use a /23 subnet, and have 2,000 desktops then you would need four desktop pools, and therefore 4 x /23 DHCP network subnets.
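The subnet arithmetic above can be checked with Python's standard `ipaddress` module. This is a minimal sketch: the function names and the 10.0.0.0 base address are placeholders of mine, and it assumes every usable address goes to a desktop (in practice a gateway and other infrastructure addresses will reduce the count slightly).

```python
import ipaddress
import math


def usable_hosts(prefix_len: int) -> int:
    """Usable host addresses in a subnet (network and broadcast excluded)."""
    # 10.0.0.0 is only a placeholder; the count depends on the prefix length.
    network = ipaddress.ip_network(f"10.0.0.0/{prefix_len}", strict=False)
    return network.num_addresses - 2


def pools_required(desktops: int, prefix_len: int, composer_pool_max: int = 1000) -> int:
    """Desktop pools needed when each pool maps to exactly one subnet."""
    per_pool = min(usable_hosts(prefix_len), composer_pool_max)
    return math.ceil(desktops / per_pool)


for prefix in (24, 23, 22):
    print(f"/{prefix}: {usable_hosts(prefix)} usable addresses, "
          f"{pools_required(2000, prefix)} pools for 2,000 desktops")
```

For the 2,000 desktop example this gives eight pools on /24 subnets and four on /23 subnets. A /22 would need only two pools, but each would hold 1,000 desktops in one broadcast domain, well above the Cisco guidance quoted earlier.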
Once you have decided on the network configuration for your environment, you can determine the required number of desktop pools. As given in the previous example, a /23 subnet will dictate a desktop pool of up to 510 virtual desktops. Bear in mind that the maximum desktop pool size should be no larger than 1,000 virtual desktops. Each desktop pool will create a single replica disk, so if you have 4 x desktop pools then you will need 4 x replica disks so bear this in mind before you design your storage configuration.
The number and size of desktop pools isn’t driven solely by the network sizing though, as manageability of the View environment has a big part to play here. If you want to roll out changes you have made in the master image with a recompose operation, a pool containing 1,000 desktops will take longer to recompose than a smaller pool.
If you are using tiered storage and have a high (read) performance datastore for replica disks, then you can separate the replica from the linked clones using different datastores. Do not separate the replica disk if your storage array doesn’t offer a separate high performance datastore, as you’ll only introduce additional overhead. Given the number of desktop pools, and whether you will separate the replica and linked clone datastores, you can now calculate the number of LUNs required. See how I’ve done this in the Sizing Calculations section below.
We can’t discuss storage without talking about IOPS, and sizing for this is a critical component to your design. When I’ve run load-tests with View Planner I tend to see a ratio of 30% reads / 70% writes. In the real world I tend to see closer to 10-20% reads and 80-90% writes, but that does depend on the workload profile of the desktop user.
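As a rough illustration of what these ratios mean for the array, the snippet below splits a total IOPS figure into reads and writes. The helper name is hypothetical, and I have used the 20,000 average IOPS figure from the sizing calculations later in this article as the input.

```python
def split_iops(total_iops: int, read_pct: int) -> tuple[int, int]:
    """Split total IOPS into (reads, writes) for a given read percentage."""
    if not 0 <= read_pct <= 100:
        raise ValueError("read_pct must be between 0 and 100")
    reads = round(total_iops * read_pct / 100)
    return reads, total_iops - reads


# Load-test ratio (30% reads) versus a write-heavier real-world ratio (20% reads).
for read_pct in (30, 20):
    reads, writes = split_iops(20000, read_pct)
    print(f"{read_pct}% reads: {reads} read IOPS, {writes} write IOPS")
```

The point of the exercise is that the write figure dominates in both cases, so the storage design needs to be judged primarily on its write performance.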
Boot / login storms and steady state (typical usage in the day) are two areas that must be understood. Carrying out an assessment of the existing environment will give you an indication of the user login times, and if everybody starts promptly at 9am then you’ll need the storage solution to deliver a very high number of read IOPS in a short space of time. Technology such as the View Storage Accelerator, which leverages Content Based Read Cache (CBRC) on vSphere 5 hosts, can alleviate the hit on read IOPS. However, this only provides 2GB of read cache, so I wouldn’t rely solely on this to deliver all of the required read IOPS. If your storage vendor offers an SSD cache such as the EMC FAST Cache feature, then this will deliver a much higher number of read and write IOPS.
I am often asked how many virtual machines should be placed on a single host. Well, this does depend on the RAM available and the number of processor cores. Based on experience, don’t have more than 8-10 virtual machines per CPU core, and if you do, make sure you carry out load-testing to see what the performance impact is. On a host with 4 x quad core processors, this would be between 128 – 160 virtual machines. I’d like to stress the importance of load-testing here, as it can often be overlooked. It’s all very well saying that we will run 160 virtual machines on a single host, but what type of workload is the user running?
Using a tool such as VMware View Planner or Login VSI, you can simulate user workloads and by monitoring the performance of your hosts you can determine the maximum per core based on your workload and server hardware. In the real world I typically see 125 – 145 virtual machines per host for an average office worker desktop. The key performance metric to be aware of is CPU Ready (%RDY), and monitoring host performance during a load test will indicate whether you’ll incur potential issues. Anything above 3-5% in a VMware View environment may impact user perception of performance.
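The per-core rule of thumb is easy to turn into a starting range before any load-testing is done. This is only a sketch of that arithmetic with a function name of my own; it says nothing about RAM, which may well be the tighter limit.

```python
def vm_density_range(sockets: int, cores_per_socket: int,
                     vms_per_core: tuple[int, int] = (8, 10)) -> tuple[int, int]:
    """Lower and upper VM count per host from a VMs-per-core rule of thumb."""
    cores = sockets * cores_per_socket
    low, high = vms_per_core
    return cores * low, cores * high


print(vm_density_range(4, 4))   # 4 x quad core host from the example above
print(vm_density_range(2, 10))  # 2 x 10 core host
```

Treat the result as the range to validate with View Planner or Login VSI, watching CPU Ready as described above, rather than as a target to deploy against.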
By default, vCenter will alert when a host reaches 80% of memory or CPU utilisation. In a VMware View environment I feel that 80% is too conservative and I prefer to increase this to 90% for CPU and 95% for memory. Whilst this may seem rather high, remember that if you are hosting non-persistent virtual desktops you want the best consolidation ratio possible. With server workloads, HA (High Availability) will restart virtual machines on another host in the cluster in the event of a host failure. With non-persistent desktops you instead want to make sure you have enough ‘spare’ virtual desktops in the pool. If a host fails, users will be disconnected but they can log back in and connect to another spare virtual desktop in the pool. This is ideal for non-persistent desktops using View Composer, but it won’t work if you require persistent desktops, as a persistent virtual desktop will need to be restarted on another host, so bear this in mind.
|Total RAM (MB)||393,216||384GB RAM (384 x 1024)|
|Number of sockets||2|
|Number of cores per socket||10|
|MHz per core||2,700|
|Total CPU MHz||54,000|
|CPU Available (90% Utilisation)||48,600||10% overhead|
|RAM Available (95% Utilisation)||373,555||5% overhead|
|System drive (GB)||25||Windows 7 System (C:)|
|Persistent data drive (GB)||0||Not required in this example|
|Number of vCPUs||1|
|Average CPU Usage (MHz)||300|
|Average Memory Utilisation (MB)||1,024|
|Configured RAM (MB)||1,946||1.9GB (1.9 x 1024)|
|Memory reservation (MB)||1,946||Eliminates virtual swap|
|Estimated memory overhead @ 2.5% (MB)||49|
|Total RAM Required (MB)||1,994|
|Average Peak IOPS||44|
|Host Desktop Sizing||Value||Hosts Required|
|Total CPU Required (MHz)||600,000||12|
|Total RAM Required (MB)||3,988,480||11|
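For anyone wanting to substitute their own numbers, the host calculation in the tables above can be reproduced roughly as follows. The function names are my own, and the inputs are the example host and desktop values from this article. Note that the script prints the unrounded ratios: the CPU figure works out at about 12.3 hosts, which the table shows as 12, so rounding up strictly would give 13.

```python
import math


def host_capacity(ram_mb: int, sockets: int, cores_per_socket: int, mhz_per_core: int,
                  cpu_pct: int = 90, ram_pct: int = 95) -> tuple[float, float]:
    """Usable (CPU MHz, RAM MB) per host at the chosen utilisation ceilings."""
    total_mhz = sockets * cores_per_socket * mhz_per_core
    return total_mhz * cpu_pct / 100, ram_mb * ram_pct / 100


def hosts_needed(desktops: int, mhz_per_desktop: float, ram_mb_per_desktop: float,
                 cpu_avail_mhz: float, ram_avail_mb: float) -> tuple[float, float]:
    """Unrounded host counts needed for CPU and for RAM."""
    cpu_hosts = desktops * mhz_per_desktop / cpu_avail_mhz
    ram_hosts = desktops * ram_mb_per_desktop / ram_avail_mb
    return cpu_hosts, ram_hosts


# Example host: 384GB RAM, 2 sockets x 10 cores at 2,700MHz.
cpu_avail, ram_avail = host_capacity(384 * 1024, 2, 10, 2700)

# Example desktop: 1.9GB configured RAM plus 2.5% overhead, 300MHz average CPU.
ram_per_desktop = 1.9 * 1024 * 1.025

cpu_hosts, ram_hosts = hosts_needed(2000, 300, ram_per_desktop, cpu_avail, ram_avail)
print(f"CPU available per host: {cpu_avail:,.0f} MHz")
print(f"RAM available per host: {ram_avail:,.0f} MB")
print(f"Hosts for CPU: {cpu_hosts:.2f} (rounded up: {math.ceil(cpu_hosts)})")
print(f"Hosts for RAM: {ram_hosts:.2f} (rounded up: {math.ceil(ram_hosts)})")
```

Neither figure includes spare capacity for a host failure, so treat whichever number you settle on as the minimum.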
Using these calculations we take the higher of the two host counts, in this case 12. Given that the maximum number of hosts in a View 5.1 cluster is 8 when using VMFS storage, we would need to split this into two clusters of 6 hosts each. This is an ideal illustration of the benefit of NFS storage, as this cluster limit does not apply and we could have a single cluster of 12 hosts. Also, based on the guidance above, I have decided to use four /23 subnets and two desktop pools per cluster (four in total).
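The cluster split is simple enough to do by hand, but here is a small sketch for larger host counts. The function name is hypothetical, and the default of 8 is the VMFS limit quoted above; pass a higher limit where it does not apply.

```python
import math


def split_into_clusters(hosts: int, max_hosts_per_cluster: int = 8) -> list[int]:
    """Spread hosts as evenly as possible over the fewest clusters allowed."""
    if hosts < 1 or max_hosts_per_cluster < 1:
        raise ValueError("hosts and max_hosts_per_cluster must be at least 1")
    clusters = math.ceil(hosts / max_hosts_per_cluster)
    base, extra = divmod(hosts, clusters)
    # 'extra' clusters take one additional host so the split stays balanced.
    return [base + 1] * extra + [base] * (clusters - extra)


print(split_into_clusters(12))      # VMFS limit of 8 hosts per cluster
print(split_into_clusters(12, 12))  # assumed higher limit, e.g. NFS
```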
|Number of hosts required||12||Can use in a single cluster if using NFS|
|Peak IOPS Required||88,000||Combined read/write|
|Average IOPS Required||20,000||Combined read/write|
|Number of desktop pools||4||Based on 510 desktops per pool|
|Number of replica datastores||4||Using separate high performance datastores|
|Number of linked clone datastores||16||2,000 / 128|
|Size of replica disk (GB)||25|
|Size of linked clone (GB) at 15% growth||4||15% growth of parent image|
|Datastore size for linked clones (GB)||480||Linked clone size x 128|
|Datastore size for replica disk (GB)||98|
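The storage rows in the table above follow from a handful of inputs, sketched below. The function and key names are my own, and the inputs are the example values from this article. Note that the 480GB datastore size comes from the unrounded 3.75GB linked clone size (shown as 4GB in the table). I have not attempted to reproduce the 98GB replica datastore figure, as its derivation isn't given.

```python
import math


def storage_plan(desktops: int, pools: int, replica_gb: float = 25,
                 growth_pct: float = 15, clones_per_datastore: int = 128,
                 peak_iops_per_desktop: int = 44) -> dict:
    """Datastore counts and sizes for linked clones with separate replica datastores."""
    clone_gb = replica_gb * growth_pct / 100
    clone_datastores = math.ceil(desktops / clones_per_datastore)
    return {
        "replica_datastores": pools,  # one replica disk per desktop pool
        "linked_clone_datastores": clone_datastores,
        "total_datastores": pools + clone_datastores,
        "linked_clone_gb": clone_gb,
        "linked_clone_datastore_gb": clone_gb * clones_per_datastore,
        "peak_iops": desktops * peak_iops_per_desktop,
    }


plan = storage_plan(desktops=2000, pools=4)
for name, value in plan.items():
    print(f"{name}: {value}")
```

In this example that comes to 20 datastores in total: four for replicas and 16 for linked clones.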