HCI – Hero From Day Zero

After a great reception at NetApp Insight 2017 (it was so good that actual orders pushed back our demo system), and thanks to NetApp, I have finally got my hands on their exciting new portfolio product.

First Impressions

We received an 8-node setup, 4 storage and 4 compute nodes, which turned up as a pallet of IT equipment. That was a little unexpected at first but, on reflection, it does mean the hardware is a lot more manageable to get from the box into the rack. It all comes nicely packaged in NetApp branded cartons. The storage nodes also have the disks individually packaged for adding into the chassis.

So, upon first inspection of the blades/nodes, I can see NetApp have partnered with a hardware vendor who is renowned for producing server hardware. They feel sturdy and are well crafted. Adding them into the system is a smooth process and doesn’t need any excessive force, something I have seen with other blade systems in the past. Starting from the bottom and working up, we racked the two chassis to begin with. The important thing to note is the 3 strips of protective clear plastic film along the top of each chassis MUST be removed before installation. Once racked, it was on to adding the additional nodes into the chassis. We opted for a two and two approach with the two compute nodes in the top of the chassis and two storage nodes below.

The reason for this is that there is extra airflow via the top of the chassis (hence removing the film), which benefits the compute nodes. This is only a recommendation, though: any type or size of node can occupy any of the available slots. If you add a storage node to the configuration, you will also have to insert the accompanying drives. Again, make sure you add these into the corresponding bays in the front.

Getting Set Up

In preparation for deploying our HCI equipment, we have also deployed a management vSphere cluster (6.5) and in here amongst other things, we have created our PDC and SDCs each sharing responsibility for AD, NTP, DHCP for both the mgmt. and iSCSI networks, and most importantly, DNS. I can’t stress enough when it comes to networking: 9 times out of 10, it’s a DNS issue. Make sure you get your forward and reverse lookup zones correct.
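As a rough illustration of the forward and reverse lookup point, here is a minimal Python sketch that checks whether a hostname resolves to the IP you planned and whether that IP resolves back to the same name. It is not part of NDE or any NetApp tooling; the function name `check_dns` and any hostnames or addresses you feed it are assumptions for illustration only. The resolver functions are passed in as parameters so the check can be tried without a live DNS server.

```python
import socket


def check_dns(hostname, expected_ip,
              forward=socket.gethostbyname,
              reverse=socket.gethostbyaddr):
    """Return a list of problems found for one hostname/IP pair.

    `forward` maps a name to an IPv4 address and `reverse` maps an address
    to a (name, aliases, addresses) tuple, matching the socket functions.
    An empty list means the A record and the PTR record agree.
    """
    problems = []

    # Forward lookup: the name should resolve to the planned address.
    try:
        resolved_ip = forward(hostname)
        if resolved_ip != expected_ip:
            problems.append(
                f"forward: {hostname} -> {resolved_ip}, expected {expected_ip}")
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        problems.append(f"forward: {hostname} did not resolve ({exc})")

    # Reverse lookup: the address should resolve back to the same name.
    try:
        resolved_name = reverse(expected_ip)[0]
        if resolved_name.rstrip(".").lower() != hostname.rstrip(".").lower():
            problems.append(
                f"reverse: {expected_ip} -> {resolved_name}, expected {hostname}")
    except OSError as exc:  # socket.herror is a subclass of OSError
        problems.append(f"reverse: {expected_ip} has no PTR record ({exc})")

    return problems
```

Calling `check_dns("vcenter.hci.example", "10.0.0.10")` with your own FQDN and address (the ones here are made up) before starting a deployment gives a quick list of anything missing from either zone.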

What I have learned from my time with the NetApp HCI platform is that understanding what this system requires from a networking perspective, and setting that up, is key to a successful deployment. The team and I had reviewed the documentation available on the NetApp support site (the prerequisites checklist and the installation workbook), yet our first attempt failed at 7%, which we traced to a VLAN configuration issue on the switches. After that, it was pretty much plain sailing.

As you can see from below, we left a monitor connected up to one of the compute nodes and we can see it deploying ESXi in standard fashion.

We have factory reset the HCI kit several times to get an understanding of the different options during the NDE process, and it’s fair to say they are pretty self-explanatory (each option has a blue “i” next to it which gives detailed information about what you are configuring). One thing we did note is that using the basic networking wizard and then flipping over to the advanced one pre-populates pretty much all the fields while giving you more control over what is assigned. We wanted to move the mNode from next to the VC to next to the MVIP for the SolidFire cluster, and simply changing the digits of the last octet turned the box red as unverified. To get the engine to check the IP against everything else on the page, and to check that it’s not in use, you have to delete the decimal point associated with that octet. You also cannot separate the vMotion and management subnets without the use of a VLAN tag. If you don’t add the tag before trying to separate these, the engine’s behaviour can be a bit unclear unless you understand how the physical network topology is designed to interact with the HCI platform. It’s also good to see that you cannot proceed until everything has been entered properly. Another handy feature is the ability to download a CSV copy of all the variables (passwords are redacted) just before you hit deploy.
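If you plan addresses on paper before opening the wizard, a small offline check can catch clashes early. The sketch below is my own illustration, not how the NDE validates its fields: `validate_plan`, the role names, and every address in it are hypothetical. It only confirms that each address is well-formed, unique within the plan, and inside one of the planned subnets; it cannot tell you whether an address is already in use on the network.

```python
import ipaddress


def validate_plan(plan, subnets):
    """Check a {role: ip} plan against {network_name: cidr} subnets.

    Returns a list of human-readable problems; an empty list means every
    address is well-formed, unique, and inside one of the given subnets.
    """
    networks = [ipaddress.ip_network(cidr) for cidr in subnets.values()]
    problems = []
    seen = {}  # address -> first role that claimed it

    for role, text in plan.items():
        try:
            addr = ipaddress.ip_address(text)
        except ValueError:
            problems.append(f"{role}: '{text}' is not a valid IP address")
            continue

        if addr in seen:
            problems.append(f"{role}: {addr} is already assigned to {seen[addr]}")
        else:
            seen[addr] = role

        if not any(addr in net for net in networks):
            problems.append(f"{role}: {addr} is outside every planned subnet")

    return problems


# Hypothetical example values: the mNode placed next to the MVIP.
example_subnets = {"management": "10.10.20.0/24"}
example_plan = {
    "vCenter": "10.10.20.10",
    "MVIP": "10.10.20.50",
    "mNode": "10.10.20.51",
}
print(validate_plan(example_plan, example_subnets))
```

Keeping the plan in a dictionary like this also makes it easy to compare against the CSV of variables that the NDE lets you download before deployment.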

By repeating the setup process, we got an idea of how long it takes: from the final review of the NDE inputs and clicking “Looks Good, lets Go,” we were seeing 6.0u3a deploy in just over 35 minutes and 6.5u1 run to the 55-minute mark. When watching the progress bars, it’s clear that more time is spent deploying the VCSA with 6.5, which probably explains why it’s a lot easier to use and less buggy than its predecessor. I have been trying to move over to the appliance for a while now, and with the work I have been doing with this HCI platform and 6.5, I am now a convert.

Up and Running

Once the NDE is complete, you can click the blue button to launch the vSphere Client, which will connect to the FQDN entered during the NDE. Once logged in to the client, we can see from the home landing page that the plugins for SolidFire have been added: NetApp SolidFire Configuration (for adding SF clusters, turning on vVols, user management, joining up to mNode) and NetApp SolidFire Management (for reporting, creating datastores and vVols, adding nodes, drives, etc.).

NDE will create a DataCenter and a containing Cluster with HA and DRS enabled, and add the hosts to it. It also creates two datastores on the SolidFire cluster, each 1.95TB in size and formatted as VMFS6 with SIOC enabled. Sadly, the current management plugin will only create VMFS v5 for any datastores you wish to create after initial deployment, so if you need or want v6 then you are going to have to destroy the datastore and recreate it with the newer version on the LUN. It is a minor issue, but it could become laborious if you have quite a few datastores. What is nice, though, is that you can configure the SolidFire cluster to provide vVols and datastores at the same time, and with it being a SolidFire back end, you get the guaranteed quality of service you expect for any storage provided from that platform.

Take Away

I have to say that I have been impressed by the HCI platform. From getting it in the door to racking and stacking and then progressing through the NetApp Deployment Engine, it has been a smooth and risk-free process. The guard rails of the NDE allow for a robust vSphere deployment yet allow you to tweak parts to fit it to your environment (e.g. you don’t have to deploy a new vCenter; you can join an existing one). As I mentioned above, it has also helped win me over to using the VC appliance, and there will be no going back now for me. Having spent time working on the kit, I can fully understand the choice NetApp have made in providing separate storage and compute nodes, and I am confident that customers will also see the benefit of truly flexible, independent scalability in an HCI deployment, not to mention the performance and fault tolerance of a shared-nothing architecture. I look forward to the first customer reviews, and from the number of quotes Arrow have been putting together recently on this product, it’s not going to be long before it is established as a leader in this market segment.

Next steps

So when it came time to test the HCI platform, I was chatting to a friend who informed me there was a VMware fling that could help. Now, I had heard about the wondrous nature of the flings whilst listening to @vPedroArrow and @lost_signal on the Virtually Speaking Podcast a while back, but in my current line of work I hadn’t had a need to use them until now. In my next post I will go into more detail on these and look at some of the results that I received.