The three things that large enterprises want out of hyperconverged storage, and that EMC will be bringing to bear with its ScaleIO server-SAN hybrid, are scalability, performance, and ease of consumption. The company is previewing preconfigured ScaleIO appliances, due to ship early next year, that it says will be suitable for large enterprises that have been, for the most part, hesitant to invest in this fast-growing part of the storage market.
EMC is, of course, the dominant seller of hardware-based SANs and it has no desire to stop selling its VMAX arrays to customers who want to scale out their machines. Additionally, VMware, of which EMC has the majority stake, has been extending the scale and scope of its Virtual SAN (VSAN) software, which rides atop its ESXi hypervisor and provides a virtualized clustered SAN across the nodes in the compute cluster. VMware has been very careful to position VSAN against rival upstarts like Nutanix, SimpliVity, Hewlett-Packard, Pivot3, Scale Computing, and a few others, and recently launched a hyperconverged appliance called EVO SDDC that mashes up ESXi server virtualization, VSAN storage virtualization, and NSX network virtualization and that scales across up to eight racks for a total of around 8,000 server-class VMs in a single domain. VSAN is capped at 64 nodes in a single domain – although we have heard reports that some customers have been allowed to push above 100 nodes – and for some large enterprises, that scale is not sufficient for their block storage.
With its ScaleIO hyperconverged storage, EMC is trying to create a product that does something a little different from either VMAX or VSAN while at the same time giving it something that it can sell against the hyperconverged storage upstarts in the large enterprise datacenter. As The Next Platform has previously reported, plenty of people think that over the long haul traditional SAN hardware will be replaced by generic storage servers running software much like EMC ScaleIO, Nutanix Xtreme Computing Platform, HP StoreVirtual, and so on, and the analysts at Wikibon have put some numbers on their prognostications, showing the rise and domination of server SAN hybrids and, importantly, flash as the storage device, in the next decade. But these transitions do not happen overnight. Despite various kinds of convergence, companies still have server, storage, and networking teams that operate independently (but probably less so than in the past), and they need to feel comfortable with any new technology before throwing themselves completely behind it. It took a long time to convince companies to rip their storage out of servers and share it with SANs, and it will take a similarly long time to convince companies to use a different kind of SAN – unless they are under tremendous pressure to deliver scalable SANs at a much lower cost, that is.
Such was the case at Swisscom, the formerly state-owned telecom company in Switzerland, says Jyothi Swaroop, head of product marketing for the ScaleIO product at EMC. Swisscom wanted more scalable and less expensive block storage for the private cloud it uses to create and run its internal applications for its telecom and cable subscribers, and it has grown its ScaleIO cluster over the past several years to 7 PB and is planning to push it even further.
“Most telecom and service provider customers are under tremendous pressure, and if they can’t scale their infrastructure cost effectively on the back end, they won’t be able to provide services to their customers affordably,” Swaroop says. “They are coming to us and saying that they like Amazon Web Services or Microsoft Azure because of the agility it gives them – they swipe a card and they get the storage. They want similar agility in their private cloud so they don’t have to move their applications and they will do it in-house, whether it is for internal-facing applications or cloud services. We can run ScaleIO on commodity hardware internally for them and give them the elasticity they need.”
The big issue with telcos and service providers is not just the ability to scale out their server SAN storage and to get high performance at a reasonable cost, but to get out of the nightmare of having to do capacity planning on traditional hardware-based SANs.
“Nearly all of the enterprise customers that use ScaleIO were struggling with capacity planning with their existing SAN infrastructure,” Swaroop continues. “They bought a ton of SANs and there are forklift refreshes every three years where they have to spend millions of dollars, and they don’t come cheap. And they always have to buy more SAN because they can’t predict how much storage they will need going forward. Capacity planning has been a huge challenge, and with ScaleIO, that goes away.”
The reason is simple: ScaleIO can be configured with as few as three storage server nodes to start and grow to over 1,000 nodes in its current incarnation. EMC currently has over 250 customers using ScaleIO, and has not been pushing it as aggressively as it might have otherwise because it was building a product that is rugged enough for the Global 2000, says Swaroop. EMC is probably not in a hurry to have traditional SANs replaced by either ScaleIO or VSAN, but it is far better to do the replacing across EMC products than to lose the storage deal to a hyperconverged storage upstart. Many of those current ScaleIO customers have hundreds of nodes deployed and are pushing up into the multi-petabyte range with their storage capacity, which is pretty large for enterprises.
While EMC is not putting out pricing information for the ScaleIO software or the ScaleIO node appliances that it plans to start shipping in the first quarter of 2016, Swaroop does provide some sense of the relative costs between traditional SANs and ScaleIO clusters. The company did a survey of customers who started with traditional SANs from a multitude of vendors (including but not limited to EMC itself), and averaged across those customers, a ScaleIO setup of equivalent capacity was about 30 percent less costly. It doesn’t hurt that the ScaleIO storage can be run on bare metal machines and used to create a clustered SAN appliance that any machine can reach over the network as well as a hyperconverged server-SAN cluster that can run compute jobs on the same nodes as servers, using VMware ESXi, Microsoft Hyper-V, or Red Hat KVM hypervisors on the nodes; customers can mix hypervisors across the cluster, too. Add in the ability to scale out the storage easily, eliminate overbuying capacity on the front end with SANs, and you can see the sales pitch the ScaleIO team will be making.
The ScaleIO Node Appliance
While Nutanix is inching its way from just selling appliances to eventually offering its compute-storage platform in a software-only form factor that will, we think, eventually be available on the most popular server platforms in the enterprise, ScaleIO is going in the other direction and moving from software-only to appliances. The reason is that large enterprises were asking EMC to provide a complete system that they could buy and install easily without thinking about it. The resulting servers bear the EMC badge and are actually manufactured by EMC, by the way. These are not OEM or ODM machines, and as for pricing, Swaroop says that they will cost the same as commodity servers running ScaleIO software because EMC knows it cannot charge a premium for its iron over that of other server makers.
This is all about making the sale easier and getting ScaleIO into more datacenters.
To that end, the ScaleIO Node appliances will be offered in 18 different configurations, suitable for different workloads that can broadly be characterized as one set aimed at storage capacity and the other aimed at extreme performance. The systems will have a varying amount of compute and a mix of disk, flash SSD, and PCI-Express flash card capacity, which is tuned to yield the different capacity and performance points in the ScaleIO Node product line.
Here are the feeds and speeds of the capacity nodes:
And here are the specs for the performance nodes:
To link the nodes together, EMC is using 10 Gb/sec switches from either Cisco Systems or Arista Networks, or customers can provide their own if they have another switch vendor they prefer.
Depending on the hardware in the box, a ScaleIO Node will deliver anywhere from 2,000 IOPS to 110,000 IOPS per node. Those specs are for 4 KB file sizes with 70 percent read and 30 percent write ratios and with a 50 percent hit on the SSD flash if it is present on the machines. (Some of the capacity configurations do not have flash, hence their low IOPS ratings.)
In benchmark tests by Enterprise Strategy Group on three-node and six-node configurations of ScaleIO using Cisco UCS C240 M3S servers and Micron Technology P320 PCI-Express flash cards, the system was able to deliver around 220,000 IOPS per node. Extrapolating up to 128 nodes, which ESG said was reasonable, gave a total throughput of around 24 million IOPS with the 70 percent read, 30 percent write mix on 4 KB files. Interestingly, ESG said that using its own tools for calculating total cost of ownership, EMC could demonstrate savings of 60 percent compared to traditional hardware-based SANs. How relevant the configuration tested by ESG was is debatable, since each server node had only one flash card in the benchmark.
It will be interesting to see how the capacity and performance configurations of the ScaleIO Node appliance do on similar tests. Generally, what EMC is saying so far is that a ScaleIO Node cluster can deliver 10 million IOPS with sub-millisecond response time, and that a 500-node setup can deliver 100 million IOPS. We would love to see the details on these tests, and have asked for them.
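For readers who want to sanity-check the IOPS figures quoted above, here is a minimal back-of-the-envelope sketch. It assumes perfectly linear scaling, meaning every added node contributes the same IOPS, which is our simplification and not something ESG or EMC has confirmed. The function names are ours and purely illustrative; the input numbers are the ones cited in this article.

```python
def linear_total_iops(per_node_iops: int, nodes: int) -> int:
    """Aggregate IOPS under the assumption that every node contributes equally."""
    return per_node_iops * nodes


def implied_per_node_iops(total_iops: int, nodes: int) -> float:
    """Per-node IOPS implied by a quoted cluster-wide total."""
    return total_iops / nodes


if __name__ == "__main__":
    # ESG benchmark: about 220,000 IOPS per node, extrapolated to 128 nodes.
    print(linear_total_iops(220_000, 128))          # 28160000
    # Per-node rate implied by the roughly 24 million IOPS total that was quoted.
    print(implied_per_node_iops(24_000_000, 128))   # 187500.0
    # EMC's claim: 100 million IOPS from a 500-node setup.
    print(implied_per_node_iops(100_000_000, 500))  # 200000.0
```

Strict multiplication of the ESG per-node figure gives a bit more than the quoted 24 million, so the extrapolation may have used rounded inputs or assumed something short of perfectly linear scaling; the published numbers do not say which. The per-node rate implied by the 500-node claim, about 200,000 IOPS, lands in the same neighborhood as the ESG result.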
We expect that all-flash, high performance configurations of the ScaleIO Node appliances will be quite a bit more expensive than the all-disk, capacity variants, but it is hard to say what the prices will be until the products are launched early next year and in the field for a while.