AMD doesn’t talk much about servers these days, but it looks like the company is getting ready to revamp its server component business just as it is shutting down its SeaMicro unit. The company cannot live on the margins of the PC and game console business alone and it knows it must come up with a battle plan to carve out its share of revenues and profits from the datacenter.
The first step in this process has been to decide that AMD cannot sell SeaMicro machines against the same server makers that it wants to buy its processor components. New CEO Lisa Su announced during the company’s conference call going over its first quarter financials that AMD was writing off its investment in the microserver and interconnect innovator and shuttering that business.
Back in February 2012 when Rory Read was the relatively new CEO at the company, AMD shelled out $334 million to acquire the SeaMicro business. The SeaMicro machine combined Intel’s Atom processors with a 3D torus interconnect and a special ASIC that virtualized all of the connections between processors and I/O devices like storage and network interconnects. While AMD added support for its own single-socket Opterons to the machines, it was Intel’s own Xeons that were the most popular motors in the hyperscale systems. Last year, Verizon stood up its public cloud on heavily modified variants of the SeaMicro systems, constituting what was SeaMicro’s biggest win to date. But the fact that the SeaMicro Freedom Fabric interconnect that lashed the 64 server cards in the system together could never scale beyond a single chassis hampered its adoption among cloud and hyperscale companies, despite the benefits of microservers.
Su seemed to put the blame on the microserver concept in discussing the shutdown of the SeaMicro unit, but then admitted that the competition with server partners was probably an issue. “If you think about SeaMicro, it was really specializing in microservers,” Su explained on the call. “And microservers have not developed at the pace we might have thought a couple of years ago. From the way we look at the market, our core competency is really in processors and being able to service that business either through standard or semicustom products. And that is the way we will address the server business.”
AMD is taking a $75 million writeoff against the SeaMicro assets, but CFO Devinder Kumar said that the intellectual property of the Freedom Fabric interconnect “is still available for us as needed.” That does not sound much like a plan to add proprietary interconnects to future Opteron X86 and ARM server processors.
Incidentally, there are those who believe that microservers will find their place in the datacenter, and Facebook is one of them with its new “Yosemite” microserver design, which was shown off at the recent Open Compute Summit. That microserver is based on Intel’s new “Broadwell” Xeon D processor, a low-power yet reasonably high performance processor that was designed explicitly to take on Opteron APU and ARM processors that might be deployed in microservers. There are workloads – including Intel’s own electronic design automation (EDA) system, which should break 1 million cores this year – where single-socket servers are better suited to handle massive numbers of single-threaded jobs at the same time compared to beefier two-socket Xeon systems.
The question now is what will AMD do to revamp its server processor business, and what will it do with the SeaMicro interconnect, or indeed any interconnect, as it tries to push a unified platform that can support both X86 and ARM processors. The company has put together a new server team, and one that is well connected to the hyperscale and HPC communities and that also has plenty of experience in designing systems for enterprises, so there is hope that AMD can be more of a force in datacenters in the coming years.
The Double Whammy
AMD’s server business has never quite recovered from the Great Recession. In a truly unfortunate circumstance of timing – we are talking billions of dollars in potential lost revenue here – after years of taking market share away from Intel in the server space, the “Barcelona” four-core Opteron processors were delayed by a bug just as the housing bubble was bursting in late 2007. The Barcelona bug was a big deal for key OEM customers, but was something that AMD could have absorbed had it not been for the recession and for the fact that Intel was putting the finishing touches on its “Nehalem” Xeon 5500 processor designs. By the time that the recession was in full swing in late 2008, server makers were looking for a sure bet and everyone lined up behind the Nehalem chips, which as we all know negated some of the substantial advantages that AMD had in the server space for many years. The Nehalem chips were the first Xeons to have integrated memory controllers and to ditch the front side bus, which had been bottlenecking server performance for years, in favor of the QuickPath Interconnect (QPI), which is analogous to the HyperTransport interconnect used in Opterons to great advantage.
Time and again in the systems business, major shifts in technology are caused as much by recessions as they are by the technology of the time. To be even more precise, recessions accelerate transitions that are usually already underway, since the market is always looking for lower cost, more performance, or both. Minicomputers saw their rise in the recession of the early 1980s and Unix systems got their push in the recession of the late 1980s and early 1990s. The ascent of Linux and open source might have been a political movement in the datacenter as much as a means to creating new kinds of less expensive, more malleable systems, but it was the Dot-Com Bust that accelerated the adoption of Linux in the enterprise.
The larger point is that Intel has not looked back since launching the Nehalem chips in March 2009, and its Data Center Group, as we reported earlier this week, has shown nearly linear and very predictable growth in revenues and operating profits since then, even managing to grow through the Great Recession and only showing a bit of flattening in late 2011 and early 2012 as enterprises took a pause in spending that HPC, hyperscale, and cloud shops did not.
A crucial partnership that AMD had with supercomputer maker Cray unwound because of the Opteron delays, which compelled Cray to move away from tying its various interconnects tightly to the HyperTransport links on Opterons and use more generic PCI-Express 3.0 links. Intel had PCI-Express 3.0 links on its Xeon processors years ahead of AMD on its Opterons, and eventually Intel went so far as to buy the “Gemini” and “Aries” interconnects from Cray for $140 million in April 2012, a few months after AMD bought SeaMicro. (Intel had already snapped up Fulcrum Microsystems, a maker of Ethernet switch ASICs, and would soon buy the InfiniBand networking business from QLogic.) It is interesting to think that this Cray acquisition was in response to AMD’s acquisition of SeaMicro, but it is probably more accurate to say that both companies saw a need to get interconnect expertise and AMD was looking for a way to fill the gap that was coming as Cray shifted from Opteron to Xeon motors in its massively parallel systems.
SeaMicro just couldn’t fill in that gap. AMD has been conflicted between selling its Opterons inside of SeaMicro boxes or pushing them inside of HP’s Moonshot machines. Perhaps, if AMD had spun out the SeaMicro server business to a third party manufacturing partner, got its Opteron processors inside of the SeaMicro server cards faster, and just kept the Freedom Fabric and extended it, things would have worked out differently. The third party support was key, and this is the pattern that Intel is establishing for high-end supercomputers. With the future 180 petaflops “Aurora” supercomputer for Argonne National Laboratory, Intel is the primary contractor, but Cray is being tapped as the systems integrator. AMD could have partnered with Hewlett-Packard, Dell, or Supermicro to be the manufacturers of SeaMicro designs and shared the money, which would have perhaps resulted in a better ending. But perhaps what this really means is that server buyers – and particularly hyperscale and cloud shops that want an open Ethernet stack so they can tweak it like Verizon eventually did with the SeaMicro virtualized network – didn’t want a proprietary interconnect.
AMD Server Roadmaps Loom
The one thing that AMD can’t afford to do is leave Wall Street and server makers and buyers with the impression that it is exiting the server business. So Su was very clear about this on the call with Wall Street.
“As we prioritize our R&D investments and simplify our business, we made the decision in the first quarter to exit the dense server systems business as we increase investments in our server processor development,” Su explained. “We retained the fabric technology as a part of our overall IP portfolio. We see very strong opportunities for next generation high performance X86 and ARM processors for the enterprise datacenter and infrastructure markets, and will continue to invest strongly in these areas.”
AMD is still sampling its “Seattle” Opteron A1100 ARM processor, which has eight Cortex A57 cores, up to 4 MB of L2 cache and 8 MB of L3 cache, and up to 128 GB of DDR3 or DDR4 main memory shared by those cores. Significantly, the Seattle chip has a 10 Gb/sec and a 1 Gb/sec Ethernet port on the die, and not a Freedom Fabric interconnect port. Su said that server makers are still designing systems for the Seattle chip and that volume shipments for it would occur in the second half of this year.
We talked to AMD several weeks ago about its HPC plans, and Karl Freund, general manager for HPC at AMD, was pretty tight-lipped about the company’s plans in this market sector. (Freund ran marketing for IBM’s Power Systems business in its heyday back in the dot-com boom, and has run marketing for its System z business as well as marketing for defunct server ARM chip upstart Calxeda.)
Freund hinted that some funds that AMD has received under the DesignForward program from the U.S. Department of Energy (DoE) in conjunction with the National Nuclear Security Administration (NNSA) have been used to do interconnect work. Funds from the DesignForward program as well as a separate one called FastForward are being used to do research related to memory technologies that will be needed as DDR4 is running out of gas.
Rumors have been going around for the past several weeks about future AMD products aimed at the HPC space, including an HPC-focused Accelerated Processing Unit, which is what AMD calls a hybrid CPU-GPU compute component. It seems reasonable to speculate that AMD will try to tightly couple FirePro-class GPU cards with its ARM and X86 processors, and this is what some of these leaked (and not terribly detailed) roadmaps seem to be showing.
An AMD spokesperson would neither confirm nor deny the reports coming out of a briefing in Osaka, Japan last month. “As you know we don’t comment on rumor or speculation,” the spokesperson said. “The glorious five year plan story also takes a few leaps from what was actually presented. All that said, we have our financial analyst day the first week of May and will talk through longer-term plans for server and other markets.”
The Next Platform will be attending that analyst day meeting in New York and will be meeting with Forrest Norrod, who is senior vice president and general manager of the Enterprise, Embedded, and Semi-Custom Business Group at AMD. Norrod was a chip designer at Cyrix and National Semiconductor and is notable because he ran the Data Center Solutions custom server business at Dell when it was established in 2007. That unit had shipped 1 million machines through 2012, and its success was one of the reasons why Norrod was eventually put in charge of the systems business at Dell. And while no one talks about it much, there are still plenty of DCS customers using Opteron processors even if the Opterons do not get as much play as they used to.
As far as we know, AMD is moving ahead with its “Skybridge” effort, which was announced last May and which seeks to put Opterons with X86 or ARM motors in the same sockets and with the same network fabrics and accelerator interconnects. There is also chatter about a future sixteen-core Opteron chip with a beefy GPU, which looks very interesting. This supposed chip is based on AMD’s “Zen” core, which has simultaneous multithreading support akin to Intel’s HyperThreading, and is said in the leaked presentation slide to have 8 MB of L2 cache spread across the cores and a 32 MB L3 cache. The chip, which is referred to as a “platform processor,” also includes an AMD “Greenland” stream processor that supports single precision and double precision floating point calculations and up to 16 GB of High Bandwidth Memory (HBM) for the graphics processing unit, with a pretty impressive 512 GB/sec of bandwidth between the Zen processors and the Greenland GPU. The four channel DDR4 memory controller on the die can support up to 256 GB per channel, for a max of 1 TB of memory, which is pretty respectable. The chip also has 64 lanes of PCI-Express 3.0 I/O bandwidth to hook peripherals on.
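To put those rumored numbers in perspective, here is a back-of-the-envelope sketch in Python. The channel count, capacity per channel, lane count, and 512 GB/sec figure are taken from the leaked and unconfirmed slide as described above. The PCI-Express 3.0 line rate (8 GT/sec per lane) and 128b/130b encoding come from the PCI-Express 3.0 specification and not from AMD. The function and constant names are ours and purely illustrative.

```python
# Rough arithmetic on the rumored "platform processor" figures.
# Inputs marked RUMOR come from the leaked slide and are unconfirmed;
# PCIe 3.0 line rate and encoding come from the PCIe 3.0 specification.

DDR4_CHANNELS = 4            # RUMOR: four channel DDR4 controller
GB_PER_CHANNEL = 256         # RUMOR: up to 256 GB per channel
PCIE_LANES = 64              # RUMOR: 64 lanes of PCI-Express 3.0
CPU_GPU_GB_PER_SEC = 512     # RUMOR: Zen-to-Greenland bandwidth

PCIE3_GT_PER_SEC = 8.0       # giga-transfers per second per lane
PCIE3_ENCODING = 128 / 130   # 128b/130b line encoding efficiency


def max_memory_gb(channels, gb_per_channel):
    """Maximum main memory in GB if every channel is fully populated."""
    return channels * gb_per_channel


def pcie3_gb_per_sec(lanes):
    """Peak one-direction PCIe 3.0 bandwidth in GB/sec, before
    packet and protocol overhead (8 bits per byte)."""
    return lanes * PCIE3_GT_PER_SEC * PCIE3_ENCODING / 8


if __name__ == "__main__":
    mem = max_memory_gb(DDR4_CHANNELS, GB_PER_CHANNEL)
    pcie = pcie3_gb_per_sec(PCIE_LANES)
    print(f"Max DDR4 capacity: {mem} GB ({mem // 1024} TB)")
    print(f"{PCIE_LANES} lanes of PCIe 3.0: about {pcie:.0f} GB/sec each way")
    print(f"CPU-to-GPU link vs all PCIe lanes: about {CPU_GPU_GB_PER_SEC / pcie:.1f}x")
```

If the slide is accurate, the on-package link between the CPU cores and the GPU would offer roughly eight times the peak one-way bandwidth of all 64 PCI-Express lanes combined, which is the main argument for integrating the GPU rather than hanging it off the peripheral bus.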
For AMD’s sake and for that of its potential customers looking for a very impressive ceepie-geepie, we sure hope this is real. AMD will no doubt start telling its new server story on May 6, and explain how this part – or derivatives of it – will be used in future systems.