Take a Second Look at NetApp HCI – It Will Surprise You (Part 3)

In the first two parts we looked at the HCI landscape, how NetApp HCI does things differently, and the advanced features it has over the other vendors in this space. In this final piece we look at how to position NetApp HCI to win.

HCI or not HCI

There are some vendors, and even some analysts, pushing back and saying, “Well, NetApp’s HCI isn’t true HCI,” to which you should ask, “What is true HCI?” or “What is the definition of HCI?” and then point out that there isn’t a single definition of HCI. It is also worth noting that Gartner states: “NetApp is a leader in network-attached storage and has acted as an Evolutionary Disruptor by introducing its NetApp HCI solution.”

A recent 451 Research survey asked respondents what role they expected HCI to play in their organisation’s infrastructure deployment strategy within two years. Interestingly, 71% said simplifying infrastructure management and maintenance, and the next two most common answers were accelerating provisioning and optimisation, and supporting a hybrid cloud strategy.

Improved hardware integration, simplified management, and reduced total cost of ownership (TCO) are driving HCI adoption. Customers are asking for simplicity from a single vendor they can hold accountable, and for a reduction in the overall complexity of their environments. They require hardware that is agile and lets them realise the power of software-defined infrastructure.

They are also looking for new ways of transforming their purchasing, moving from a large capital expenditure (CAPEX) to an operational expenditure (OPEX) or even a pay-as-you-go, cloud-like model. They want a solution that can lower the total cost of ownership and shorten the time to see a return on investment (ROI), and then some. They are asking for a platform with a granular, incremental growth model that can properly make use of scale-out economics. To top it off, the purchase is probably being driven by the virtualisation or line-of-business teams rather than the infrastructure, compute, or storage teams of old. So if these are the responses we are getting from customers, then does NetApp HCI not conform to the ideal?

Hybrid Cloud

There are three drivers of digital transformation: the speed at which you can innovate, how quickly you can bring that innovation to market, and improved customer experience. NetApp HCI can supply seamless scalability, portability, performance, and easy automation. It can connect to public cloud providers and manage data ingress and egress to the public cloud; it can run applications and microservices; it can solve today the problems that IT professionals expect to plague them within two years; and it can overcome performance bottlenecks, increase VM consolidation, and avoid the various first-generation HCI taxes, all while letting IT organisations evolve to a mode 2 operating strategy. So why would you not want your customers to get their hands on this wonderful piece of kit?

Not only can NetApp HCI talk the talk, it can walk the walk. So let’s shake things up, go out there and disrupt the norm and what people think infrastructure can do, be bold and confident in a platform that delivers a plethora of abilities, and change what the HCI initialism means: from Hyper Converged Infrastructure to Hybrid Cloud Infrastructure.


Take a Second Look at NetApp HCI – It Will Surprise You (Part 2)

In part one we discussed the HCI marketplace and how NetApp HCI differs from other vendors in this space. In this article we will look at some of the other functionality NetApp HCI brings to the conversation, but first let’s look at private cloud.

Private Cloud

As cloud services from the likes of AWS started to take off and the problem of shadow IT emerged, application admins and developers started asking for similar as-a-Service infrastructure within their own organisation so they could deliver their projects properly. This in turn made infrastructure teams look at bimodal IT practices and start transitioning to the mode 2 way of deploying applications. With this we see more and more organisations migrating towards private cloud as they implement technology that allows them to provide IT resources that can be consumed at will, closely monitored, and controlled by code. This ability to automate tasks is key to transitioning from your current server room to a private cloud.

To help IT organisations realise this vision, the VMware integration gives you the ability to implement the vRealize Suite. The vRealize Operations management pack provides insights, detailed monitoring, and reporting; vRealize Orchestrator lets you really start to utilise the deep API integration and define workflows to heavily automate the environment. Then there is the Storage Replication Adapter for Site Recovery Manager to simplify the recovery process in the event of a disaster. You can also implement VMware’s NSX technology, giving you the ability to define and control virtual networks, and their security, composed entirely of software. It is worth noting that if you prefer to use PowerShell, Java, Python, Ansible, Puppet, or Chef for your automation, you are free to do that too.
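If you do go down the scripting route, the storage layer of NetApp HCI (Element OS) exposes a JSON-RPC API that is easy to drive from almost any language. Below is a minimal sketch in Python; the management virtual IP, credentials, and API version are placeholders, and the method and field names should be checked against the Element API reference for your release.

import requests

MVIP = "https://192.0.2.10/json-rpc/10.0"   # placeholder management virtual IP and API version
AUTH = ("admin", "password")                # placeholder cluster admin credentials

def element_call(method, params=None):
    """POST a single JSON-RPC request to the Element API and return its result."""
    payload = {"method": method, "params": params or {}, "id": 1}
    resp = requests.post(MVIP, json=payload, auth=AUTH, verify=False, timeout=30)
    resp.raise_for_status()
    return resp.json()["result"]

# List the volumes the cluster is currently serving
for vol in element_call("ListVolumes").get("volumes", []):
    print(vol["volumeID"], vol["name"], vol["totalSize"])

The same pattern sits underneath vRealize Orchestrator workflows or Ansible playbooks: they are all ultimately making the same API calls.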

So if you have a NetApp HCI environment strengthened by VMware’s vRealize Suite, present a self-service portal backed by a catalogue of services, and couple this with the ability to monitor usage and provide a charge-back mechanism, you can find yourself transitioning from being pools of compute or pools of capacity to infrastructure as a service (IaaS) or platform as a service (PaaS). That can be a key part of a digital transformation strategy, but that’s a topic for another day.

Functionality

Sometimes your IT ecosystem has extra requirements, and there are several of note that NetApp HCI can help you address on top of providing a best-in-class platform for your virtual environment.

If your applications require some form of NAS, then don’t forget that the NetApp Deployment Engine can set up a virtual node of ONTAP Select for vNAS functionality in your vSphere environment, providing you with practically all of the functionality that ONTAP brings to the table.

Now, what if you were to run Red Hat on top of the VMware layer and then utilise Kubernetes along with the NetApp Trident orchestrator? You suddenly find yourself in an environment that can run microservices with persistent storage. And if you are going to be running hundreds or even thousands of microservices, you will most definitely want a way to control their performance via gQoS, which is another strong case for NetApp HCI. A minimal example of requesting such storage is sketched below.
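To make that concrete, here is a minimal sketch (Python, using the official Kubernetes client) of a persistent volume claim that a microservice could make against a Trident-backed storage class. The class name “solidfire-gold” is purely illustrative; your Trident backend and class names will differ.

from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running inside a pod

# Claim 20Gi of persistent storage from a (hypothetical) Trident storage class
pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="demo-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        storage_class_name="solidfire-gold",   # hypothetical Trident class name
        resources=client.V1ResourceRequirements(requests={"storage": "20Gi"}),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="default", body=pvc
)

Trident then carves a volume out of the Element storage cluster to satisfy the claim, which is where the gQoS controls described above come into play.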

Another piece of functionality worth noting is that with NetApp HCI you can exploit what NetApp are calling the Open Storage Model: if an application residing outside of the NetApp HCI environment requires storage, and you have the capacity and performance available, you can connect those other compute platforms to the required storage.

Data Fabric

The Data Fabric is NetApp’s vision for the future of data management. Whether that be on-premises, public, or hybrid cloud, the Data Fabric ensures the right cloud for the right workload. It allows for consistent and integrated data services, gives you the power to control access, wrap security around your data, and gain deep insights into your organisation’s data.

One of the key areas of the Data Fabric is the ability to move your data to where it is best suited for the application, and at the core of this are a couple of NetApp technologies. The primary one is the replication technology of ONTAP, the world’s number one storage operating system: SnapMirror. Originally designed to help with data protection and disaster recovery, this technology can transfer data not only between systems running ONTAP, such as AFF, FAS, ONTAP Select, and Cloud Volumes ONTAP, but also to and from systems running Element OS (the operating system of the storage within NetApp HCI) in bi-directional relationships. A second major feature of Data Fabric transfer is FabricPool. This tiering technology allows an ONTAP system to granularly move cold or secondary data off the performance tier to an S3-connected bucket, the capacity tier, whether that is provided by NetApp’s content repository software StorageGRID or by AWS or Azure.

Join us for the third and final part of this series where we will be looking at the main reasons why NetApp HCI differs from the rest of the players in this market space.

Take a Second Look at NetApp HCI – It Will Surprise You (Part 1)

Defining HCI

It is just over a year since NetApp decided to use its strengths and pedigree from the storage industry to enter the Hyper-Converged Infrastructure (HCI) marketplace. This is an area of IT spending where the global HCI market is now at $3 billion in annual revenue, and it’s expected to grow to $7 billion by 2020. IDC estimates an 86% compound annual growth rate, and with the quantities of data being moved to or created on these platforms, it was always going to end up on the radar of the major IT vendors. With Dell EMC, HPE, VMware, and Microsoft entering the market along with several start-ups, and even Cisco wanting to gain a foothold in this space, it was only a matter of time before NetApp appeared on the scene. From a hardware perspective, they initially released three designs of compute node and three designs of storage node, and they have recently refreshed both lines and are now delivering over 10 variants of compute to the specifications that customers are asking for. But there is more to this part of their portfolio than just various pieces of hardware, and for one reason or another it has the other vendors in this space running scared.

Why it’s different

The savvy HCI reader amongst you may have already spotted a major difference when it comes to NetApp’s HCI portfolio: the separation of compute and storage onto separate units, or nodes. As NetApp were late to this market segment, they had the unique ability to look at what others had done before them and see what worked and what didn’t. One thing they noticed was that a large number of the key players with first-generation HCI products had achieved simplicity by utilising a shared-core approach, i.e. sharing CPU to do both compute and storage tasks. To achieve this simplicity, these vendors made architectural and design compromises that created performance and flexibility limitations: technical debt, if you will.

These first-generation HCI vendors combined compute and storage into a single unit with fixed scaling ratios, and that simplicity doesn’t lend itself well to flexibility: you are forced to purchase unnecessary compute and storage when you only need to increase one of these resources. In this scenario you can end up with unconsumed or, worse still, isolated resources; we refer to this as the Resource Tax you are forced to pay. Combine this with the fact that you need to purchase additional hardware, as many first-generation HCI products require controller virtual machines (VMs) on every node, whose resource utilisation can reach as high as 30%, something especially noticeable with a smaller installation footprint. This is called the Shared Core Tax. Certain applications also require per-socket licensing, which can be costly in a virtual environment; you are paying for the flexibility and resiliency you gain from abstracting the hardware from the application, but if you add in the fact mentioned above, whereby you can lose up to 30% of your compute resources just to keep the environment functioning, you end up paying excessive licensing costs. This is known as the Software License Tax. In a first-generation HCI environment, these taxes mount up faster than the Sheriff of Nottingham cancelling Christmas. NetApp feel this is unfair to customers.
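To see how these taxes compound, here is a rough back-of-the-envelope sketch; the node count, core counts, overhead percentage, and per-socket licence cost are all illustrative assumptions rather than vendor figures.

# Back-of-the-envelope illustration of the taxes described above.
# Every figure here is an assumption made purely for the arithmetic.
nodes = 4
sockets_per_node = 2
cores_per_node = 32                 # assumed dual 16-core sockets per node
controller_vm_overhead = 0.30       # assumed worst-case shared-core overhead
licence_per_socket = 7000           # assumed per-socket application licence cost

total_cores = nodes * cores_per_node
usable_cores = total_cores * (1 - controller_vm_overhead)
lost_cores = total_cores - usable_cores

# You still licence every socket, even though ~30% of the cores are busy
# running controller VMs rather than your workloads.
licence_bill = nodes * sockets_per_node * licence_per_socket
print(f"Cores lost to controller VMs: {lost_cores:.0f} of {total_cores}")
print(f"Licence cost per raw core:    ${licence_bill / total_cores:,.0f}")
print(f"Licence cost per usable core: ${licence_bill / usable_cores:,.0f}")

With these assumed numbers you lose roughly 38 of 128 cores, and the effective per-core licence cost rises by over 40%, which is the Software License Tax in a nutshell.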

Next Generation

NetApp took the decision to go to market with separate compute and storage nodes so that customers have the ability to independently scale the resources as and when they require. This fresh and unique methodology was able to set NetApp apart from the other vendors.

One of the things that HCI is supposed to address is virtual machine consolidation, giving customers the ability to run greater numbers of VMs on a single piece of hardware. Recent generations of CPUs have massive core counts and support RAM into the terabyte range, yet many of the gen-one HCI vendors still have limits on the number of VMs they can run on a single compute node (even when you ignore the Shared Core Tax). With NetApp HCI this hurdle can be easily cleared because of the integration of their best-in-breed guaranteed quality of service (gQoS) on the storage: if an application requires a specific IO profile, this can be defined and subtracted from the cluster’s known overall performance maximum, as sketched below.
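As an illustration, pinning an IO profile to a volume on the Element storage layer looks roughly like the sketch below. The endpoint, credentials, volume ID, and IOPS figures are placeholders, and the exact parameter names should be verified against the Element API reference.

import requests

MVIP = "https://192.0.2.10/json-rpc/10.0"   # placeholder management virtual IP and API version
AUTH = ("admin", "password")                # placeholder cluster admin credentials

# Give a (hypothetical) volume a guaranteed floor of 5,000 IOPS,
# a ceiling of 15,000 IOPS and a short-term burst of 20,000 IOPS.
payload = {
    "method": "ModifyVolume",
    "params": {
        "volumeID": 42,
        "qos": {"minIOPS": 5000, "maxIOPS": 15000, "burstIOPS": 20000},
    },
    "id": 1,
}

resp = requests.post(MVIP, json=payload, auth=AUTH, verify=False, timeout=30)
resp.raise_for_status()
print(resp.json())

The min value is the interesting one: it is what lets you subtract a known amount from the cluster’s overall performance budget and still keep your promises to every other workload.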

So that’s it for part one of this three part series. In Part two we will delve into the world of Private cloud and where NetApp HCI fits into this. We will also be looking at some of the other functionality that NetApp HCI so keep a look at for that.

It takes a Village

The above is a favourite saying of a friend of mine, and I think it rings true more than we know. In today’s modern society we work together and draw on other people’s help and knowledge more and more on a daily basis. I for one know that if I can’t do something, I go looking for blog posts or even YouTube videos on the topic for advice. This could be repairing washing machine handles or how to get your soufflés to rise; someone out there has shared a prized piece of know-how to complete the task at hand.

In the IT community the idea of working together is still alive and well. Whether you are on the network team, the virtualisation team or maybe you are a DevOps team, maybe you follow Jeff Bezos’ two pizza rule. The point of working together to better the environment has never been more true. When I first got started with NetApp I probably had more questions than answers and thankfully NetApp has a website that helped – the NetApp Community Site.

The new landing page

One of the best things about this site (and others like it) is that it puts you in touch with literally thousands of users with varying skills and levels of knowledge: people scattered around the globe in different time zones, only too happy to help. One of the reasons I like working with NetApp so much is that if I have a problem or an issue, I know that if I post something on the community site, someone somewhere will help, regardless of whether they work for a customer, a partner, or NetApp.

Getting to the topics of interest

The Community site has recently gone through a facelift, and its new and improved user interface looks fresh and is straightforward to navigate. You can easily get into sections devoted to your favourite subject, be that Flash and NVMe, Python developer discussions, or topics on the newly updated NetApp U courses and exams; you can always find someone to converse with. You can either search for a topic or start a new discussion effortlessly from the home page, an effort to help those who need it as quickly as possible. It is also a nice place to access blog posts, whether on the official NetApp blog site or something created by the community; it’s a great location for a distilled look at the current topics of discussion. So I would urge you, if you haven’t had a look for a while, to check out the new and improved, version 2.0 NetApp Community Site, and who knows, maybe you have the knowledge that could help out someone in need.

Oh, and on the soufflés: make sure you don’t overfold your egg whites into the base.

Making time for Insight

With Insight US just a few weeks away, and as I have still to complete my session calendar for the event, I thought this would be a good time to highlight the sessions that have stood out for me, in the hope that it may help you make a decision. With 307 sessions, 238 speakers, and 45 exhibitors, how do you distil this down to something manageable and meaningful?

Whilst I normally spend more time on a particular track, it is worth asking yourself, “What am I going to take home from this conference?” Are you there to get as much information as possible, or to get skilled up on a particular topic, either for an upcoming project or to break into a new area of business? This is probably something you should decide before you go hell for leather filling your calendar with random topics like FlexGroup (is that even a thing?).

On my first pass over the catalogue I had 38 interests, which is way too many for even the most hardcore conference attendee, so some culling will need to be done. One thing that bothers me, and probably every conference attendee, is the time slot where you have 10 interests happening at the same time and a big blank hole in the two hours prior. Thankfully, vendors have started to record sessions at conferences for that very reason, so for those you cannot make there is always the ability to review them at some other point; plus some sessions just hit you with way too much to take in, so you may need to hear them a second time.

Cloud Volumes is probably going to be the hot topic this year, and with 50 sessions to choose from there’s plenty on offer. The first thing I would suggest is to verify whether it’s a Cloud Volumes ONTAP (formerly known as ONTAP Cloud) or a Cloud Volumes Service (NFSaaS) session you have picked, so you get the correct information. I’m sure a few people will get this wrong this year, and you don’t want to be one of them.

1228-2 Designing and Deploying a Hybrid Cloud with NetApp and VMware Cloud on AWS, presented by Chris Gebhardt and Glenn Sizemore, is sure to be a popular session, and it will hopefully build on the session NetApp gave at Tech Field Day at VMworld US last month.

1261-2 NetApp Cloud Volumes Service Technical Deep Dive, presented by Will Stowe, is probably going to be one of those sessions people leave and tell others they need to see. With its huge potential, Cloud Volumes Service will become integral to many customers’ data fabrics over the coming year, so I’d advise getting skilled up on this as soon as you can.

If you are new to all things cloud and wondering where might be a good place to start, then schedule 4117-1 Cloud Volumes Service and 4118-1 Cloud Volumes ONTAP sessions in the Data Visionary theatre at Insight Central to give you a good idea on these two technologies.

Another product name change: Cloud Control is out and has been rebranded NetApp SaaS Backup, but there is a lot more this SaaS suite offers. With that piece of knowledge, there is a session on one-stop backup for Salesforce (1121-2); then head on over to 1188-2, NetApp SaaS Backup for Office 365, to complete the picture.

With security being a major focus in the IT industry as a whole, there are several sessions of note on this subject. 1234-2 – Data Security at NetApp: An Overview of the NetApp Portfolio of Security Solutions, by Juan Mojica, would be an excellent place to start if you haven’t thought about how to begin such a huge undertaking.

You may want to follow that up with 1103-2 – Securing and Hardening NetApp ONTAP 9, with Andrae Middleton. Remember, security teams need to get policies and procedures right 100% of the time; hackers only need to get it right once.

1214-2 What’s On Tap in the Next Major Release of NetApp ONTAP by surviving podcast host Justin Parisi (It has been a bit Hunger Games/Highlander on the tech ONTAP podcast recently) will no doubt fill up fast as any new OS payload draws in the crowd and the Q&A after that session may spill out into the halls.

1136-3 will also be popular, as it covers the advancements made to SnapMirror, with best practices for both the flash and cloud worlds.

It also looks like some of the sponsors have upped their game, with some excellent sessions. Veeam, for instance, have six sessions to choose from, which is great as they are now on the NetApp price book. 9107-2 – Veeam: Veeam Data Availability Deep Dive—Exploring Data Fabric Integrations, presented by Michael Cade and Adam Bergh, will highlight just some of the great reasons why Veeam have been added to the price book; then head over to the hands-on labs, as Veeam has made it into the Lab on Demand catalogue. Veeam also have a data exchange whiteboard session, 8102-1 – Veeam: Availability Outside the Datacenter: Public Cloud & Veeam Availability Suite 9.5 Update 4, which, as some of you keen-eyed people may have noticed, will include some information about the anticipated Update 4.

For those of you who like your speed turned up to eleven, you may want to attend 9126-1 – Intel® Optane™ Memory Solutions. And I would be remiss if I didn’t mention my colleagues at Arrow, with their 9112-2 – Arrow: From IoT to IT, Arrow Electronics Is Accelerating Your Digital Transformation, looking at how to deliver and scale an IT infrastructure to meet the challenges of deploying IoT solutions.

There are also the certification prep sessions, and with NetApp U’s recent release of two new hybrid cloud certifications, sessions 1279-1 and 1280-1 will no doubt draw a crowd. So if you are planning on having a go, make sure to get these in your diary, and I may bump into you there, as the hybrid cloud certification I achieved at Insight two years ago is up for renewal.

Whilst this list is my picks, I would suggest you spend a bit of time ahead of the event populating your calendar with the topics you want to hear, and do it sooner rather than later so you can get on the list before a session fills up. Just remember that pretty much all sessions are repeated during the conference, and spend some time at Insight Central, as an hour there can be just as beneficial as a session; but most of all, enjoy yourself. I would strongly suggest you follow the A-Team members on Twitter for up-to-the-moment reviews of sessions and whether catching the second running is worth amending your calendar. And before you start filling the comments section with “Duh – FlexGroup is a hugely scalable container of NAS storage that can grow to trillions of files and yottabytes of capacity”, there is a session on the topic: 1255-2, FlexGroup: The Foundation of the Next-Generation NetApp Scale-Out NAS.

Expanding NetApp HCI

NetApp recently updated the version of their HCI deployment software to v1.31. This version contained several new features to help in deploying a NetApp HCI environment. It’s been several months since I initially deployed our demo kit, and I felt it was time to revisit this process and see what has changed.

One welcome new feature is the removal of the reliance on having a DHCP server that covers both your 1GbE management and 10/25GbE data networks. Whilst this is a nice idea to help you get up and running, and is easy to configure in the lab, having DHCP running within a production SAN is not exactly common practice. Previously you had to either set one up or spend time configuring static addresses, which could be time-consuming, especially if you had half a dozen or so blades.

The other new feature that caught my eye was the ability to use the NetApp Deployment Engine (NDE) to expand a NetApp HCI environment. As previously mentioned in an earlier post and video (here), adding a SolidFire storage node to an existing cluster is quite easy (in fact, it was a design methodology when they created Element OS), but adding an ESXi node is quite a labour-intensive task. It is great to see that you can now add these quickly through a wizard.

To start the expand process, simply point your browser to the following:

https://storage_node_management_ip:442/scale/welcome
where you are greeted by the following landing page:

As you can see, it wants you to log in to your environment. You may also notice that NetApp have updated the text box to show the password once typed, as indicated by the eye icon at the end of the line.

To test this new methodology, instead of buying more nodes (which would have been nice), I removed a single storage node and a single compute node from their respective clusters and factory reset them. This allowed me to test not only the addition of new nodes into existing clusters but also the removal of the DHCP or static IP addressing requirement before deployment.

Once logged in, the NDE scale process discovers any and all available nodes, and this is where you select which of them you would like to add to your environment.

After agreeing to the VMware EULA, you are asked to provide the VC’s details and then to select the datacentre and cluster you wish to add the node to. These steps are only present if you are adding compute nodes.

After giving the compute node a root password, you are taken to the “Enter the IP and naming details” page.

Finally, NDE scale takes you on to a review screen as these three screenshots (headings fully expanded for visibility) show.

Once reviewed, click the blue “Add Nodes” button. This initialises the now familiar NDE process of setting up NetApp HCI that can be tracked via a progress screen.

The scaling process for the addition of one compute and one storage node took just under half an hour to complete. But the real benefit is that this scaling wizard sets up the ESXi host, plus networking and vSwitches, as per NetApp HCI’s best practices, whilst at the same time adding a storage node to the cluster. That isn’t the quickest thing to do manually, so having a process that does it for you speedily is a huge plus in NetApp’s favour, especially if you have multiple hosts. It’s clear to see the influence the SolidFire team had on this update, in the ease and speed with which customers can now expand their NetApp HCI environments with NDE scale. I look forward to the features that will be included in upcoming releases of NetApp HCI; if hyperconverged infrastructure is all about speed and scale, then this update gives me both in spades.

VMC NetApp Storage

Last week at VMworld, NetApp announced a new partnership offering with VMware whereby VMware Cloud on AWS (VMC) would be able to utilise NetApp Cloud Volumes Service. The offering is currently in tech preview, so let’s take a look at these two technologies and see how they can work together.

VMware Cloud on AWS

Firstly, let’s review the VMware cloud offering. The ability to run vSphere virtualised machines on AWS hardware was announced at VMworld 2017 and was met with great approval. The ability to have both your on-premises and public cloud offerings with the same abilities and look and feel was heralded as a lower entry point for those customers who were struggling with utilising the public cloud. The VMware Cloud Foundation suite (vSphere, vCenter, vSAN, and NSX) running on AWS EC2 infrastructure is now available, but it is sold, delivered, and supported by VMware.

There are several advantages with this:

  • Seamless portability of workloads from on-premises datacentres to the cloud
  • Operation consistency between on-premises and the cloud
  • The ability to access other native AWS services, not to mention the fact that AWS data centres appear around the globe
  • On-demand flexibility of being able to run in the cloud

With VMware running the suite themselves rather than informing customers how to deploy, set up, and run it, a customer could be ordering and utilising a new vSphere offering within an hour. With VMC, the customer has the choice of where to run their workload, with the flexibility to migrate it back and forth between their private data centre and AWS with ease.

Cloud Volumes Service

When NetApp moved into the cloud market several years ago, their first offering was the ability to run a fully-functioning ONTAP virtual appliance on AWS (later available on Azure). This offering, originally called Cloud ONTAP then ONTAP Cloud and more recently renamed Cloud Volumes ONTAP (CVO), is a cloud instance you spin up, set up, and manage like a physical box, with all the features you have come to love on that physical box, whether that be storage efficiencies, FlexClone, SnapMirror, or multi-protocol access. It was all baked in there for a customer to turn on and use.

More recently, NetApp has launched Cloud Volumes Service (CVS). This service is sold, operated, and supported by NetApp, providing on-demand capacity and flexible consumption, with a mount point and the ability to take snapshots. It is available for AWS, Azure, and the Google Cloud Platform. The idea behind Cloud Volumes Service is simple: you let NetApp manage the storage, so you can concentrate on getting your product to market faster. Cloud Volumes Service gives you the file-level access to capacity you require, at a given service level, in seconds. It also comes with the ability to clone quickly and replicate cross-region if required, whilst providing always-on encryption at rest. That’s why over 300,000 people use NetApp Cloud Volumes Service already.

There are three available service levels: Standard, Premium, and Extreme, offering 16, 64, or 128 KB/s of throughput per GB of allocated quota respectively (these are service levels, not guarantees).

(Example pricing as of 10 July 18) https://docs.netapp.com/us-en/cloud_volumes/aws/reference_selecting_service_level_and_quota.html

With the three different performance levels at varying capacities, you can mix and match to meet your requirements. For example, let’s say your application requires 12 TB of capacity and 800 MB/s of peak bandwidth. Although the Extreme service level can meet the demands of the application at the 12 TB mark, it is more cost-effective to select 13 TB at the Premium service level.
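The arithmetic behind that choice is easy to script. The sketch below uses the throughput-per-GB figures quoted above, with purely illustrative per-GB monthly prices (the real prices are on the NetApp pricing page linked above), so treat the cost output as a placeholder.

import math

SERVICE_LEVELS = {
    # name: (KB/s of throughput per allocated GB, assumed $/GB per month)
    "Standard": (16, 0.10),
    "Premium": (64, 0.20),
    "Extreme": (128, 0.30),
}

def cheapest_allocation(capacity_tb, peak_mb_s):
    """Return the cheapest (level, allocated TB, monthly cost) that satisfies
    both the capacity and the peak-throughput requirement."""
    options = []
    for level, (kb_per_gb, price_per_gb) in SERVICE_LEVELS.items():
        # Required TB so that allocated GB x KB/s-per-GB covers the peak MB/s
        tb_for_throughput = peak_mb_s / kb_per_gb
        allocated_tb = math.ceil(max(capacity_tb, tb_for_throughput))
        options.append((allocated_tb * 1000 * price_per_gb, level, allocated_tb))
    monthly_cost, level, allocated_tb = min(options)
    return level, allocated_tb, monthly_cost

level, tb, cost = cheapest_allocation(capacity_tb=12, peak_mb_s=800)
print(f"{level}: allocate {tb} TB (~${cost:,.0f}/month at the assumed prices)")

With the assumed prices this lands on 13 TB at the Premium service level, in line with the example above.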


Partnership

Let’s take a look at the options that we now have. We have NetApp Private Storage (NPS), where a customer owns, manages, and supports a FAS system in a datacentre connected to AWS via a dedicated Direct Connect. We have the ability to deploy an instance of Cloud Volumes ONTAP from the AWS marketplace which the customer manages and connects to the infrastructure via an elastic network interface (ENI). Or we have the Cloud Volumes Service provided and managed by NetApp, connected to AWS via a shared Direct Connect. All three of these can be utilised to connect to VMC on AWS. These currently supported configurations have the guest connected using iSCSI, NFS, and/or SMB via Cloud Volumes Service, Cloud Volumes ONTAP, and NPS.

The use case currently available to all is where the guest OS accesses storage via iSCSI, SMB, and/or NFS using CVO. With no ingress or egress charges within the same availability zone, and the ability to use Cloud Volumes ONTAP’s data management capabilities, this is a very attractive offering for many customers. But what if you wanted to take it further than just the application layer? This is what was announced last week.

This announcement is for a tech preview of datastore support via NFS with Cloud Volumes Service. This is a big move. Up to this point, datastores were provided via VMware’s own technology, vSAN. By using CVS with VMC, you gain the ability to manage both the compute and the storage as if they were on premises, rather than as something that lives in the cloud.

As you can see, Cloud Volumes Service is supplying an NFS v3 mount to the VMC environment.

As this is an NFS mount from an ONTAP environment with no extra configuration, you can gain access to the snapshot directory.

Moving forward, VMC will be able to access NetApp Private Storage to provide NFS datastores, allowing customers to keep ownership of their data whilst also allowing them to meet any regulatory requirements. In the future, Cloud Volumes ONTAP will be able to provide NFS datastores to a VMC environment. There are several major use cases for cloud in general, and VMC with Cloud Volumes provides increased functionality to all these areas, whether that be disaster recovery, cloud burst, etc. The ability to provide NFS and SMB access with independent storage scale backed by ONTAP is a very strong message.

If you are considering VMC, this is a strong reason to look at Cloud Volumes to supply your datastores, decoupling your persistent storage requirements from your cloud consumption requirements, or to exceed what vSAN can do.

Gain Some IQ on AI

Today (1/8/18) NetApp announced a new partnership with NVIDIA and launched the NetApp ONTAP AI proven architecture. This strengthens their already growing foothold in this new and exciting branch of the IT industry, and after what was announced today, ONTAP AI is surely going to have everyone talking. This meet-in-the-channel play gives data scientists a proven architecture to use in their data pipeline for deep learning, avoiding design guesswork and allowing for fast, efficient deployments of AI environments.

Machine learning (ML) and artificial intelligence (AI) place some unique demands on IT. First, they both demand huge amounts of information: a capacity requirement that is constantly growing. Second, they require that storage to respond with ultra-low latency. Unlike big data, you need to keep all the data generated rather than burning the hay to find the needle, so expandability over time is a must. And finally, the type of computation they undertake is more suited to a GPU than a CPU.

Now, whether you would class this as a modernise-your-infrastructure play or a next-generation data centre play, one thing is certain: this is cutting-edge equipment. For example, a single NVIDIA DGX-1 is the equivalent of replacing 400 traditional servers, and if you look at Gartner’s top 10 picks for 2018 and beyond, the majority have an aspect of AI/ML to them, so it is only natural that we are seeing IT vendors moving into this space.

NetApp are announcing the ability to combine an AFF A800, their flagship all-flash array, with five NVIDIA DGX-1 systems with Tesla V100s, tied together over 100GbE with a pair of Cisco Nexus 3232C switches, which equates to 5,000 TFLOPS.

Whilst the messaging around this offering highlights it as a future-proof play, you don’t need to buy everything in one go; instead you can build upon NetApp’s key messages of flexibility and scaling. But if you were to plan ahead, or really did need to start big, there is no reason you could not have twelve high-availability pairs with sixty (60x) DGX-1 systems and close to 75PB of capacity. There is also no reason you couldn’t implement a data pipeline with an A700s, or even an A300 or A220; it all depends on what performance and scalability you require. Tie this together with edge devices running ONTAP Select for data ingest, the ability to use Cloud Volumes ONTAP in AWS or Azure, and possibly FabricPool for an archival tier, and you can truly see why integrating the Data Fabric into this story is such a nice fit. Just imagine adding MAX Data into the mix; it would be like strapping two F9 first-stage boosters onto an already Full Thrust rocket.

Now, you may be thinking this is a supercomputer niche corner case, but in reality it is being utilised in pretty much every industry vertical, affecting almost every aspect of our daily lives. The finance, health, automotive, retail, agriculture, oil and gas, and even legal industries, to name a few, are already seeing a surge in software and companies dedicated to this as a way of doing business. We have the horror stories of Facebook, and no doubt you have invested in one of the big three home automation voice assistants featuring Alexa, Siri, or Assistant. Maybe you have travelled using Uber or Tesla’s Autopilot, or even Waze on your phone. Maybe you have a hobby like flying drones from DJI or utilise 3DR’s software, or you can’t work out without your Fitbit or Fenix. The point is that you are providing data back to some central point, where it is analysed to help the company make better decisions about what to bring to market as a next-generation product or where to improve something already in the field. Whilst the luddites worry that AI will lead to Skynet and the doom of humanity, it is probably better to think of it as an advancement in human intelligence and another milestone down the path of evolution, and I look forward to seeing how this architecture develops.

NetApp – Not Just Surviving but Thriving

When you’re a company that has been around for over 25 years, some people might look at you like a dinosaur, slowly plodding along as the end of the world as you know it approaches. A lot of press has been made in recent years that puts NetApp in this light. Some have said that NetApp has just been plodding along, not in touch with the industry or its customers’ needs.

Yet in the last 6 months, this “dinosaur” has started to show its teeth. The stock price has gone from $37.43 in September to a high of $71.41, and with the announcements made yesterday, you can expect that to go higher.

With the newly announced AFF A800, NetApp is now able to provide sub 200 µs latency for workloads that have the most demanding data needs. That’s an order of magnitude better than previous generations!

Not only is the AFF A800 blazingly fast, it can handle huge amounts of traffic, with 25GB/s of throughput on an HA pair and the ability to run NVMe end to end from the server to the storage via NVMe over FC. If using 32 or 16Gb FC isn’t a requirement, you can use Ethernet speeds of 100GbE, another industry first from NetApp. With 12 pairs clustered together, you are talking 300GB/s of throughput in a single management domain. That should meet the most demanding environments.

With a current run rate of $2.0B for their all-flash business, having already shipped over 20PB of NVMe, and with 44% year-on-year petabyte growth, NetApp’s flash business is not only going to increase in size in the future, but with numbers like this it will survive any extinction event.

But the announcements made yesterday are not just about end-to-end NVMe-accelerated performance. There were also more advanced cloud integration messages.

NetApp’s cloud strategy is geared towards enabling customers to deliver business outcomes for all IT workloads in cloud, multi-cloud, and hybrid cloud environments. To do this, you must modernise your data management from the edge, to the core, and to the cloud.

FabricPool is one of the features designed to help you do just that. FabricPool enables automatic tiering of cold data, which means you can purchase a smaller system or get an even higher amount of consolidation on a single box. With the release of ONTAP 9.4, FabricPool has been improved to allow Azure as a capacity tier and ONTAP Select as a performance tier. It can now also tier from the active primary data set, which is something I am looking forward to testing soon.

So when you look at these and other announcements that NetApp made yesterday, if they are a “dinosaur,” I would put them in the meat-eating Velociraptor camp. And that’s one dinosaur you do not want to take your eye off.

Setting up FabricPool

Recently, I was lucky enough to get the chance to spend a bit of time configuring FabricPool on a NetApp AFF A300. FabricPool is a feature that was introduced with ONTAP 9.2 that gives you the ability to utilise an S3 bucket as an extension of an all-flash aggregate. It is categorised as a storage tier, but it also has some interesting features. You can add a storage bucket from either AWS’s S3 service or from NetApp’s StorageGRID Webscale (SGWS) content repository. An aggregate can only be connected to one bucket at a time, but one bucket can serve multiple aggregates. Just remember that once an aggregate is attached to an S3 bucket it cannot be detached.

This functionality doesn’t just work across the whole of the aggregate—it is more granularly configured, drawing from the heritage of technologies like Flash Cache and Flash Pool. You assign a policy to each volume on how it utilises this new feature. A volume can have one of three policies: Snapshot-only, which is the default, allows cold data to be tiered off of the performance tier (flash) to the capacity tier (S3); None, where no data is tiered; or Backup, which transfers all the user data within a data protection volume to the bucket. Cold data is user data within the snapshot copy that hasn’t existed within the active file system for more than 48 hours. A volume can have its storage tier policy changed at any time when it exists within a FabricPool aggregate, and you can assign a policy to a volume that is being moved into a FabricPool aggregate (if you don’t want the default).

AFF systems come with a 10TB FabricPool license for using AWS S3. Additional capacity can be purchased as required and applied to all nodes within the cluster. If you want to use SGWS, no license is required. With this release, there are also some limitations as to which features and functionality you can use in conjunction with FabricPool: FlexArray, FlexGroup, MetroCluster, SnapLock, ONTAP Select, SyncMirror, SVM DR, Infinite Volumes, NDMP SMTape or dump backups, and the Auto Balance functionality are not supported.

FabricPool Setup

There is some pre-deployment work that needs to be done in AWS to enable FabricPool to tier to an AWS S3 bucket.

First, set up the S3 bucket.

Next, set up a user account that can connect to the bucket.

Make sure to save the credentials; otherwise you will need to create another access key, as the secret key cannot be retrieved again.
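If you prefer to script this AWS pre-work, a rough sketch with boto3 is shown below. The bucket name, user name, and region are placeholders, and attaching the broad AmazonS3FullAccess policy is just for brevity; in practice you would scope the policy to the single bucket.

import boto3

REGION = "eu-west-1"                     # placeholder region
BUCKET = "my-fabricpool-bucket"          # placeholder bucket name
USER = "fabricpool-user"                 # placeholder IAM user name

# Create the S3 bucket that will act as the capacity tier
s3 = boto3.client("s3", region_name=REGION)
s3.create_bucket(
    Bucket=BUCKET,
    CreateBucketConfiguration={"LocationConstraint": REGION},
)

# Create the IAM user FabricPool will use and give it access to S3
iam = boto3.client("iam")
iam.create_user(UserName=USER)
iam.attach_user_policy(
    UserName=USER,
    PolicyArn="arn:aws:iam::aws:policy/AmazonS3FullAccess",
)

# The secret access key is only returned once, so record it now
key = iam.create_access_key(UserName=USER)["AccessKey"]
print("AccessKeyId:    ", key["AccessKeyId"])
print("SecretAccessKey:", key["SecretAccessKey"])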

Finally, make sure you have set up an intercluster LIF on a 10GbE port for the AFF to communicate to the cloud.

Now, it’s FabricPool time!

Install the NetApp License File (NLF) required to allow FabricPool to utilise AWS.

Now you’ll do the actual configuration of FabricPool. This is done on the aggregate via the Storage Tiers sub menu item from the ONTAP 9.3 System Manager as shown below. Click Add External Capacity Tier.

Next, you need to populate the fields relating to the S3 bucket with the ID key and bucket name as per the setup above.

Set up the volumes if required. As you can see, the default of Snapshot-Only is active on the four volumes. If you wanted, you could select an individual volume or a group of volumes whose policy you want to alter and change it in a single bulk operation via the dropdown button on top of the volumes table.

Hit Save. If your routes to the outside world are configured correctly, then you are finished!

You will probably want to monitor the space savings and tiering, and you can see from this image that the external capacity tier is showing up under Add-on Features Enabled (as this is just after setup, the information is still populating).

There you have it! You have successfully added a capacity tier to an AFF system. If the aggregate is over 50% full (otherwise why would you want to tier it off?), then after 48 hours of no activity on snapshot data, it will start to filter out to the cloud. I have shown the steps here via the System Manager GUI, but it is also possible to complete this process via the CLI, and probably via API calls too, though I have yet to look into those; a rough sketch of the CLI route is shown below.
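For the curious, here is a rough sketch of what that CLI route might look like when scripted with Python over SSH. The addresses, credentials, object store, bucket, aggregate, and volume names are all placeholders, and the command syntax is approximate, so verify it against the ONTAP documentation for your release before using anything like this.

import paramiko

CLUSTER_MGMT = "192.0.2.50"              # placeholder cluster management LIF
USER, PASSWORD = "admin", "password"     # placeholder credentials

commands = [
    # Define the external capacity tier (the AWS S3 bucket created earlier)
    "storage aggregate object-store config create -object-store-name demo_s3 "
    "-provider-type AWS_S3 -server s3.amazonaws.com "
    "-container-name my-fabricpool-bucket -access-key AKIAEXAMPLE",
    # Attach the capacity tier to the all-flash aggregate
    "storage aggregate object-store attach -aggregate aggr1 -object-store-name demo_s3",
    # Change a volume's tiering policy from the default snapshot-only
    "volume modify -vserver svm1 -volume vol_demo -tiering-policy backup",
]

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(CLUSTER_MGMT, username=USER, password=PASSWORD)
for cmd in commands:
    stdin, stdout, stderr = ssh.exec_command(cmd)
    print(f"> {cmd}")
    print(stdout.read().decode(), stderr.read().decode())
ssh.close()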

One thing to note is that whilst this is a great way to get more out of an AFF investment, it is a tiering process, and your data should still be backed up, as the metadata stays on the performance tier (remember the 3-2-1 rule). So, when you are next proposing an AFF, or an all-flash aggregate on a cluster running ONTAP 9.2 or above, consider using this pretty neat feature to get even more capacity out of your storage system, or what I now like to call your data fabric platform.