So 2016 is the year, or at least it is supposed to be. The year when 64-bit ARM chips finally make their way into servers and perhaps start getting wheeled through the loading bays of actual datacenters to start running real workloads alongside the Xeon processors that by and large dominate in the glass house.
Applied Micro has its X-Gene 1 and X-Gene 2 processors ramped, Cavium has its ThunderX in production as well, AMD is gearing up its Opteron A1100, and others are making their way to the field of battle to try to win some share against the X86 chips used in servers and the Power, MIPS, X86, and other specialized ASICs that are used in a variety of networking and storage gear.
In his opening keynote at the recent ARM TechCon, ARM Holdings chief technology officer Mike Muller declared that this was the year when ARM-based servers had arrived.
“I think that we are on the cusp now of having this plethora of ARM-based servers to actually transform what happens in that marketplace,” Muller said. “It is not just about servers, because when you look at it, it is about everything from network infrastructure to wireless access and it is where the ARM business model, with its partnerships, allows you to have very diverse pieces of silicon being put into very different products that scale all the way from scale out servers on one end through to very specialized, dedicated networking and wireless components on the other. Wireless infrastructure is a market segment where we are going to have 20 percent, 40 percent, 80 percent market share over the coming five years. So for us, it is very exciting to see ARM 64-bit technology pervading out from what is our traditional from sensors out to smartphones and really going out and pushing all the way into servers.”
Such a high market share penetration for wireless infrastructure may seem optimistic, but ARM has seen this movie before.
Like Intel’s assault on the datacenter, the ARM incursion is starting on the client side where the volumes are and gradually moving into the switches, storage arrays, and servers that actually run the applications that appear to run on client devices. In 2009, Muller explained, about 400 million mobile client devices shipped, with about half being smartphones and half being laptops. (He ignored the non-mobile desktop market that Intel has dominated for decades.) Fast forward to 2014, and about 1.9 billion devices shipped, with the overwhelming majority of them being smartphones; the laptop market is more or less flat at 200 million units and tablets make up about 300 million units. About 85 percent of the 1.4 billion smartphones sold in 2014 were powered by an ARM chip, and this share is expected to hold in 2015, too. Interestingly, Muller said that more than half of the ARM-based smartphones would be using a variant of the 64-bit ARMv8-A core architecture. (Apple started off the move to 64-bit mobile client devices two years ago with its iPhone and iPad lines, and others have followed suit.)
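As a quick sanity check, the device figures Muller cited can be tallied in a few lines of Python. This is only a sketch using the rounded numbers quoted above; the variable and function names are ours, and the 64-bit figure assumes, purely for illustration, that 2015 smartphone volumes match the 2014 figure, which Muller did not say.

```python
# Unit shipments for 2014 as quoted above, in millions of units.
SMARTPHONES = 1_400
LAPTOPS = 200
TABLETS = 300


def arm_smartphones(smartphones_m, arm_share_pct=85):
    """ARM-powered smartphones in millions, using integer percent math."""
    return smartphones_m * arm_share_pct // 100


total_devices = SMARTPHONES + LAPTOPS + TABLETS
arm_phones = arm_smartphones(SMARTPHONES)

# "More than half" on 64-bit only gives a lower bound, and only under the
# assumption (ours) that 2015 volumes equal the 2014 volumes.
min_64bit_phones = arm_phones // 2

print(f"Total mobile client devices: {total_devices:,} million")
print(f"ARM-based smartphones: {arm_phones:,} million")
print(f"64-bit ARM smartphones: more than {min_64bit_phones:,} million")
```

The three categories do add up to the 1.9 billion devices cited, and an 85 percent share of 1.4 billion smartphones works out to roughly 1.19 billion ARM-based phones.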
By far the most interesting thing that ARM divulged during TechCon was a revelation to Wall Street and chip industry analysts that the company expected to garner 25 percent share of server shipments by 2020. This is an astounding number, and is an increase over the 20 percent share projections that ARM and its partners had been talking about for the past few years and that we discussed at length a few months ago.
What has caused ARM and its collective chip sellers to boldly increase their server market share projections even as the ramp to commercial ARM server shipments has arguably taken years longer than expected? There are a few factors at work here, Lakshmi Mandyam, director of server systems and ecosystems at ARM, explains to The Next Platform.
“If you would have asked me a couple of years ago, I might not have been quite as excited about it,” Mandyam concedes. “But if you look at where we are at the tipping point of having 64-bit platforms, having a base software platforms available with the operating systems and the middleware, and if you also look at the dynamics in the world shifting in terms of regional interest, such as in China, in having their own solutions but yet compete at a global scale. The ARM architecture gives them the ability to do that. The other thing is a shift in deployment modes. We have always said from the beginning that our target is cloud, open source applications, and if we look at where the interest is from scale out companies, it is all Linux and community based software. These are what we see shifting our addressability in the market.”
The days when ARM-based servers can be relegated to a small segment of the market called microservers – which have found a place but are nowhere near the 10 percent of the market many had predicted five years ago when the idea first came into being – are over, says Mandyam. ARM chip makers such as Applied Micro, Cavium, Qualcomm, and Broadcom are targeting high performance computing explicitly, and generally speaking, the ARM collective is aimed at Xeon E3 and all but the most capacious Xeon E5 processors (including the Xeon D aimed specifically at hyperscalers and specifically at microservers like Facebook’s “Yosemite” quad-node Open Compute sled). And Mandyam says that the current ARM designs can compete, pulling out this comparative chart to make the case:
The chart above uses the SPEC_rate_2006 benchmark, which is a throughput metric, to gauge the relative performance of two of Intel’s 10-core, 20-thread current “Haswell” Xeon E5 chips against system-on-chip designs implemented by ARM Holdings with 20 cores and 20 threads. ARM chips using the older Cortex-A57 cores, which are more than three years old, were tested as well as ones using the Cortex-A72 cores, which were announced last February. This is pitting prototype ARM chips and estimates against actual Xeon processors, of course, so it is not precisely an apples-to-apples comparison, and as the chart notes, some of the I/O processing is done by the Xeon CPU and some of the I/O power consumed by the ARM chips is not counted. ARM doesn’t make chips for production, but it has to make the case that an ARM chip can, in theory, compete with at least some portion of the Xeon market.
Six months ago, ARM started running its own web site on servers using chips based on its architecture, and Muller was perfectly frank that ARM wanted to make sure that the Hewlett Packard Enterprise “Moonshot” servers, which use quad-node ProLiant m400 cartridges based on Applied Micro’s X-Gene 1 processors, and the software stack (including the Ubuntu Server variant of Linux from Canonical, the Nginx web server, and the MySQL database from Percona) were robust enough to handle the web traffic.
Mandyam also brought up comparisons that online payment processor PayPal divulged when it moved some of its firewall, virtual private networking, authentication, and other web workloads from traditional (and unspecified) X86 iron to the Moonshot m400 cartridges. PayPal’s math showed that the X86 iron would cost 1.8X as much as the Moonshot machines, would consume 7X the power, and would offer one-tenth the node density per rack. PayPal did not say how much cost savings – if any – came from the move to the X-Gene 1 chips, but did say there was a “game changing” cost per watt per cubic foot benefit over the X86 machinery. PayPal subsequently deployed an unknown number of Moonshot servers using hybrid ARM-DSP SOCs from Texas Instruments, based on the ProLiant m800 cartridges, to do complex event processing on data from its own systems and from various social media feeds, where people often complain about companies, so that it can find and fix problems with its online applications more quickly. Comparisons to X86 iron in this case were not divulged.
As we have pointed out many times before, the hyperscalers and cloud builders (particularly those offering software or storage as a service, not raw compute that is tied to the X86 architecture) will be the first movers, and these companies tend to have direct relationships with original design manufacturers in Taiwan and China.
And as for China, it is a special case, as Mandyam points out, in that it wants to have an indigenous architecture that it can control top to bottom. The Chinese government invested heavily in developing the “Godson” variant of MIPS, which was supposed to span from smartphones to supercomputers as ARM is trying to do, but we do not hear much about this anymore from Loongson Technology except for client devices. (The Godson chips are interesting in that they can emulate X86 and ARM instructions.) China is investing in a variant of the Power8 processor through Suzhou PowerCore and the Institute of Computing Technology, the latter being one of the government bodies that created PC and server maker Lenovo. Chip makers Rockchip Electronics and Allwinner Technology, both from China, have come out of nowhere in the past several years to become dominant suppliers of chips for tablets, and they could get some server aspirations, too. Huawei Technologies’ HiSilicon unit is developing its own ARM chips, too, for network infrastructure, and Phytium is working on a monster of a server chip as well.
The point is this: China could back Power chips at the high end for big grunting workloads where heavy cores, memory bandwidth, and tight coupling with coprocessors are needed and ARM for all the rest and have two processors it more or less can control for indigenous products. (It is a kind of big-little strategy that spans a datacenter rather than a chip architecture as ARM has been promoting.) Anything built in China can, in theory, be distributed worldwide – national politics permitting, of course.
For the moment, there are on the order of hundreds of serious proofs of concept using 64-bit ARM servers, according to Mandyam, and Matt Eastwood, senior vice president of the enterprise infrastructure and datacenter group at IDC, tells The Next Platform that there were thousands of ARM server units shipped in 2014, with the numbers still not in for 2015. That is a long, long way away from a market that could be shipping 12 million to 15 million servers in 2020. (That number depends on how you count a server.)
“There are a lot of seed units and there is a lot of testing, but I would not say that anyone has done anything at any kind of scale,” says Eastwood, referring to the ARM server shipments thus far. “So this 25 percent share in only five years’ time by ARM is a pretty bold statement. People wanted to paint ARM into the microserver corner, but microservers never really took off and the fact that there are 64-bit chips now changes the equation a bit. If you add up hyperscale, HPC, and China, that is probably more than half of the unit volumes in the server market – China is the second biggest market and maybe not quite 20 percent of shipments, hyperscalers are about 30 percent, and HPC is 15 percent to 20 percent. If you eliminate the overlap, you get pretty close to 50 percent of total worldwide shipments in these three areas.”
So now all that the ARM collective has to do is get half of those three slices mentioned above and it can hit its 25 percent share target.
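To put rough unit numbers on that target, here is a back-of-the-envelope sketch in Python. It uses only figures quoted above: a 2020 market of 12 million to 15 million servers, roughly 50 percent of shipments in the three segments after overlap is removed, and ARM winning half of that. The function name is ours, and collapsing the three segments into one addressable share is a simplifying assumption, not an IDC or ARM model.

```python
def arm_units_needed(total_servers, addressable_share, win_rate):
    """Return (ARM server units, overall market share).

    addressable_share: fraction of shipments in hyperscale, HPC, and China
                       combined, after removing overlap (about 0.50 here).
    win_rate: fraction of that addressable slice ARM would have to capture.
    """
    overall_share = addressable_share * win_rate
    return total_servers * overall_share, overall_share


# Low and high ends of the 2020 server shipment range cited above.
for total in (12_000_000, 15_000_000):
    units, share = arm_units_needed(total, 0.50, 0.50)
    print(f"{total:,} servers -> {units:,.0f} ARM units ({share:.0%} share)")
```

Under those assumptions, a 25 percent share means somewhere between 3 million and 3.75 million ARM servers shipped in 2020, set against the thousands of units that IDC says shipped in 2014.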
As a check against those numbers, Eastwood says that traditional enterprises are now less than half of Intel’s X86 server chip business. So all of that non-enterprise stuff is the easier target for ARM. “That is where all of the exposure is,” says Eastwood.
Despite this, Eastwood doesn’t think that the bold goal is attainable over the time that ARM Holdings has set for itself and its partners. “I don’t think ARM can get to 25 percent by 2020,” he says. “In the world of servers, things are moving faster than ever, but still, things take some time to prove out. The picture I have in my mind is that over the next two to three years, you will definitely see growth of ARM in hyperscale and in China, and you will probably see a pretty good inflection in demand in say 2019 or 2020, but I don’t think we will see 25 percent.”
A lot of things can happen between now and 2020, of course. And one of them is a recession, which tends to accelerate technology transitions. If ARM can demonstrate sustainable advantages, we think it should be able to get a chunk of the market just to give Intel some competition. Whether that is 15 percent, 20 percent, or 25 percent of the market depends on how the chip roadmaps unfold in the coming years and what Intel does to keep ODMs and OEMs aligned with it rather than ARM.