01 Nov 2011 by rayheffer
Whilst working on a Vblock 300 implementation a few weeks ago I had an interesting conversation with one of the network architects at VCE, and we discussed best practices surrounding 10Gb and 1Gb networking. Traditionally with 1Gb networking it is best practice to separate traffic on your ESX/ESXi hosts with vSwitches (or dvPortGroups) dedicated to each type of traffic (vMotion, Management, Storage, production networking), and designs will typically contain 6 to 8 NICs per host. With the introduction of 10Gb networking, I’ve noticed that some implementations have neglected some important design considerations. Let’s say that we present 4 x 10Gb NICs to each host (these are vNICs in the Cisco UCS world), or we present 6 x 1Gb NICs using traditional methods of separating the traffic into various dvPortGroups. Which is best? Can we get away with just 2 x 10Gb NICs, or do we need more? The key consideration here isn’t how many NICs (or vNICs) are presented to each host, but rather how much network bandwidth is available to each traffic type (i.e. vMotion, FT logging, VM traffic) and, critically, how we control it.
Best practice designs often take a while to stick in the mind, and more often than not we forget the all-important question of WHY something is best practice in the first place; we are all human after all.
Here is what I mean: going back a few years, I remember explaining to friends and colleagues why vMotion, storage, management and production VM traffic needs to be kept physically separate (not just on different VLANs). Now don’t get me wrong, there was always plenty of documentation to support this, but a high proportion of IT professionals were still using a single vSwitch with only two 1Gb uplinks in their designs. A few years later it became widely accepted as best practice to separate the traffic, but now with 10Gb networking we have a similar problem, and one magic ingredient is often overlooked, and that ingredient is traffic shaping!
The reason we need to introduce network traffic shaping with 10Gb networking is that uplinks can (and will) get saturated. For some reason, some folks think the limits of 10Gb will never be reached and that a pair of 10Gb NICs on each host (with no traffic shaping) will suffice. A good example of where saturation can occur is vMotion, see the graph on the left. vSphere 4.0 vMotion traffic can achieve up to 2.6Gbps per vMotion (maximum of two concurrent), so it would never max out a 10Gb link, but since vSphere 4.1 a single vMotion can achieve up to 8Gbps, and a host can perform eight vMotions concurrently. In fact, vMotion on vSphere 5 can theoretically achieve up to 40Gbps with multi-NIC vMotion (using multiple VMkernel interfaces across four adapters) – by the way, I’m not aware of this being tested, but it would be interesting to see it in action. Now that is pretty impressive if you want super-fast vMotion, but let’s not sacrifice other traffic such as our storage (NFS, iSCSI), FT logging or production networking. The question I always ask is: what are the requirements we’re putting into our design? There is no right or wrong answer for how our traffic should be prioritised, but there are some good guidelines available.
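To make the saturation point concrete, here is a trivial back-of-the-envelope sketch. The 8Gbps per-vMotion figure comes from the vSphere 4.1 numbers above; the simple min/subtract model is my own simplification, not a VMware formula.

```python
# Illustrative arithmetic only: why unshaped vMotion can starve a 10Gb uplink.
# The 8 Gbps per-vMotion figure is the vSphere 4.1 number cited in the text;
# the model itself is a deliberate over-simplification.

LINK_GBPS = 10.0          # a single 10Gb uplink
VMOTION_PEAK_GBPS = 8.0   # what one vSphere 4.1+ vMotion can achieve

def leftover_bandwidth(link_gbps, vmotion_streams, per_stream_gbps=VMOTION_PEAK_GBPS):
    """Bandwidth remaining for storage/FT/VM traffic if vMotion is unshaped."""
    used = min(link_gbps, vmotion_streams * per_stream_gbps)
    return link_gbps - used

print(leftover_bandwidth(LINK_GBPS, 0))  # 10.0 - no vMotion in flight
print(leftover_bandwidth(LINK_GBPS, 1))  # 2.0  - one vMotion leaves just 2Gbps
print(leftover_bandwidth(LINK_GBPS, 2))  # 0.0  - the uplink is saturated
```

Two concurrent vMotions are enough to flatten a single unshaped 10Gb uplink, which is exactly why the rest of this post is about controlling that.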
So, enough talk… let’s design our 10Gb converged infrastructure for vSphere!
There are a number of methods you can use to prioritise your network traffic, and these are (in my opinion) the most popular:

- VMware Network I/O Control (NetIOC) on the vSphere Distributed Switch
- Cisco Nexus 1000V with Class-Based Weighted Fair Queuing (CBWFQ)
- Cisco UCS QoS policies using the Palo / Virtual Interface Card (VIC)
Let’s look at each of them in more detail before I conclude with my recommendations.
Since vSphere 4.0 we have been able to configure traffic shaping policies on the vDS (ingress and egress), controlling average bandwidth, peak bandwidth and burst size, but these simple rate limits are no substitute for proper traffic prioritisation on 10Gb networking. NetIOC, however, which was made available in vSphere 4.1 (with Enterprise Plus), allows for a form of QoS (Quality of Service) using ‘shares’ and shapes traffic leaving the host. The diagram (left) from What’s New in VMware vSphere 4.1 Networking illustrates this.
Shares and limits define the prioritisation for the following pre-defined network resource pools:

- Virtual machine traffic
- Management traffic
- vMotion traffic
- iSCSI traffic
- NFS traffic
- Fault tolerance (FT) logging traffic
Note: vSphere 5 NetIOC also allows for user defined network resource pools.
What I really like about NetIOC is that it uses the concept of shares and limits, which you may already be familiar with from CPU and memory resource management, and it’s very easy to set up and use (you access it from the Resource Allocation tab of your vDS). VMware have published an excellent whitepaper on NetIOC which I recommend reading, even if you’re not planning on using it, as it details some of the reasons for traffic shaping that I’m outlining in this blog post.
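The shares model is easy to reason about once you see the arithmetic: under contention, each active traffic type gets link bandwidth in proportion to its share value. A minimal sketch, with hypothetical share values (not VMware defaults):

```python
# Sketch of how NetIOC-style shares divide a saturated uplink.
# Share values below are hypothetical examples, not VMware's defaults;
# NetIOC only enforces shares on egress traffic when the link is contended.

def allocate(link_gbps, shares):
    """Split link bandwidth across active traffic types in proportion to shares."""
    total = sum(shares.values())
    return {name: link_gbps * s / total for name, s in shares.items()}

shares = {"vm": 50, "vmotion": 25, "nfs": 25, "mgmt": 10, "ft": 10}
for name, gbps in allocate(10.0, shares).items():
    print(f"{name:8s} {gbps:.2f} Gbps")
```

Note that shares are relative, not absolute: if a pool is idle, its bandwidth is redistributed among the pools that are actually transmitting, which is exactly what a hard rate limit cannot do.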
I find the Cisco Nexus 1000V extremely powerful, and I see a growing number of organisations using it, which is fantastic as I’m a big fan of Cisco products. Network folk will love using the Cisco Nexus 1000V as it hands control back to them and it is what they know and love, especially shaping traffic with QoS. Since version 1.4 the N1KV has supported CBWFQ (Class-Based Weighted Fair Queuing), which is similar to using shares and essentially allows you to define traffic classes based on match criteria. It also ties in very well with Cisco UCS (see the next section). If you are familiar with Cisco products then using class-maps and matching them to a protocol is nothing new, and if you’re new to Cisco products then it’s simply awesome :)
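For those who haven’t seen it, a CBWFQ policy on the 1000V looks much like QoS configuration on any other NX-OS device. The fragment below is a hypothetical sketch only; the class names and percentages are illustrative, and you should verify the exact syntax against the Nexus 1000V QoS configuration guide for your version.

```
! Hypothetical N1KV CBWFQ sketch - names and percentages are illustrative,
! not a recommendation. Verify against the 1000V QoS configuration guide.
class-map type queuing match-any n1kv-vmotion
  match protocol vmw_vmotion
class-map type queuing match-any n1kv-mgmt
  match protocol vmw_mgmt
policy-map type queuing uplink-cbwfq
  class type queuing n1kv-vmotion
    bandwidth percent 20
  class type queuing n1kv-mgmt
    bandwidth percent 10
```

The policy is then applied to the uplink port-profile, so every host inheriting that profile gets the same queuing behaviour.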
The following pre-defined protocol matches are available:

- vmw_mgmt (management traffic)
- vmw_vmotion (vMotion traffic)
- vmw_ft (FT logging traffic)
- vmw_iscsi (iSCSI traffic)
- vmw_nfs (NFS traffic)
It goes without saying that you can create your own user-defined matches, but the main differentiation between the Nexus 1000V and NetIOC is where and how the traffic shaping takes place. Essentially, the Nexus 1000V will mark CoS tags on upstream traffic; NetIOC won’t do this, so ingress network traffic could potentially saturate the 10Gb adapters. Brad Hedlund has an excellent post on exactly this topic, and I have included a snippet from his article below which highlights some important points:
…the Nexus 1000V can apply a bandwidth transmission policy while at the same time marking the traffic for consistent treatment in the upstream network. In contrast VMware Network I/O Control does not mark any traffic. As a result the upstream network must either independently re-classify the traffic or not do anything at all and just “let it rip”. If you do the later “let it rip” approach you will have a situation in which there are no controls in how the network transmits traffic to a server. As a result you can either overwhelm your server with receive traffic and defeat the purpose of implementing NetIOC to begin with. Or, you can implement receive based rate-limiters on every VMware host in fear of excessive traffic and again defeat the purpose of intelligent bandwidth sharing that NetIOC sets out to accomplish.
If you are using Cisco UCS then you can configure QoS with the Palo / VIC (Virtual Interface Card). There are in fact two ways of doing this: one is to present vNICs to the hosts, each with a QoS policy defined; the other is to leverage the Nexus 1000V to define the CoS tags, which Cisco UCS will honour rather than the VIC-defined CoS. Each has its benefits and will again depend on your requirements. If you use the Nexus 1000V as described before, but use the QoS capability of the Virtual Interface Card (the Host Control setting), you only need to present two vNICs to each host, and the Virtual Interface Card will still process the QoS buffers, taking full advantage of the Cisco hardware.
Configuring QoS on the Cisco UCS Virtual Interface Card is pretty straightforward, as it has six pre-defined system classes (priorities):

- Platinum
- Gold
- Silver
- Bronze
- Best Effort
- Fibre Channel
Essentially you create your own QoS policies and select a priority (e.g. Platinum or Gold) for each QoS policy you define, along with options such as burst (in bytes), rate (in Kbps) and Host Control (setting it to Full allows use of the Nexus 1000V). Each QoS policy you create is then assigned to a vNIC template (one per vNIC). For example, you might end up with six vNICs (the names here are just illustrative):

- Mgmt-A and Mgmt-B (management traffic, one per fabric)
- vMotion-A and vMotion-B (vMotion traffic, one per fabric)
- Data-A and Data-B (virtual machine traffic, one per fabric)
This would present 6 NICs to each host (3 per fabric interconnect via the IO module), unless you are also leveraging the Nexus 1000V, in which case your QoS is configured on the N1KV and you only need to present 2 x 10Gb NICs (one for each fabric interconnect). In addition, you would create a QoS policy for your fibre channel storage (using the Palo card, not the 1000V!), and that’s it, simple.
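The UCS system classes behave much like shares: each enabled class carries a weight, and UCS normalises the enabled weights into a guaranteed percentage of the link under contention. A small sketch of that normalisation, with hypothetical weights (not UCS defaults):

```python
# Sketch of how Cisco UCS QoS class weights translate into guaranteed
# bandwidth: enabled classes' weights are normalised to percentages of
# the link. The weights below are hypothetical, not UCS defaults.

def ucs_bandwidth(link_gbps, weights):
    """Return each enabled class's guaranteed bandwidth from its weight."""
    total = sum(weights.values())
    return {cls: link_gbps * w / total for cls, w in weights.items()}

weights = {"platinum": 4, "gold": 3, "silver": 2, "best-effort": 1}
print(ucs_bandwidth(10.0, weights))  # platinum gets 4 Gbps, gold 3, and so on
```

As with NetIOC shares, these are guarantees under contention rather than hard caps, so idle classes don’t waste bandwidth.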
There is no single design or best practice that fits all, but there is guidance that will help you make the right choices for your environment. Neither the Cisco Nexus 1000V nor NetIOC is able to prioritise FCoE traffic, as essentially this is fibre channel as far as they are concerned, so keep this in mind when using QoS on the Nexus 1000V or with NetIOC. You don’t want to end up in a situation where you’re CoS tagging traffic but FCoE can still encroach on the 10Gb link.
The Nexus 1000V is well suited when network management is the responsibility of the network team, or when you require not only QoS on egress traffic (leaving the host) but also want to apply CoS tags for ingress. In addition, using the Nexus 1000V for QoS tagging means that you only need to present 2 x NICs to each host, whereas using the Cisco UCS Palo card (Virtual Interface Card) for QoS you’ll need to present 6 or more NICs, one per traffic type (on each fabric interconnect).
IMHO it is better to have implemented some traffic shaping than none at all, so if you choose to use NetIOC then you’ve made an excellent choice. This is where some say NetIOC only solves part of the problem: why wouldn’t you want to shape both egress and ingress traffic? Just bear in mind the differences I’ve outlined.
Whichever method you use to shape your network traffic for 10Gb networks, here are some guidelines and key considerations:
The important thing to remember is that the values stated here are for guidance only and have been derived from various sources and feedback from customer implementations I have worked on. The illustration to the left shows how a 10Gb interface can be prioritised for vSphere traffic. If your design doesn’t require NFS or FT logging then you can tweak these values to your own requirements; it really depends on each environment. Doing nothing at all with vSphere 4.1 or later is not recommended, as you now know, because vMotion can end up saturating the 10Gb links.
I’ve focused on the Nexus 1000V, Cisco UCS Virtual Interface Card and VMware NetIOC. I have chosen to focus on these three technology sets because of their growing popularity and my present involvement working on Vblock implementations. However, I hope you can apply this knowledge to your designs or existing infrastructure.