If AMD is willing and eager to spend $4.9 billion to buy a systems company – that is more than its entire expected haul for sales of datacenter GPUs for 2024 – then you have to figure that acquisition is pretty important. And so it is with the deal to buy ZT Systems, a boutique maker of high performance systems based in the old school datacenter town of Secaucus, New Jersey.
To get to Secaucus, you cross the Hudson River from New York City and then wade through some of the swamps of the Meadowlands, where a lot of things have happened over the years, including the building of the stadium shared by the Giants and Jets football teams and perhaps the fitting of a pair of cement slippers for Jimmy Hoffa. It is also near the spot where founding fathers Aaron Burr and Alexander Hamilton had a duel, with disastrous results for both.
But Secaucus is also where some radio and television transmission towers and studios are located, and being a stone’s throw from Wall Street – well, with a pretty small stone but a pretty big arm, like the one on the Statue of Liberty perhaps – it quickly became a hot spot for high frequency trading (HFT) datacenters in the 1990s. Eventually, the New York Stock Exchange and the NASDAQ also moved their datacenters out there because power and space are cheaper than in Manhattan. And Equinix, CyrusOne, QTS, Centersquare, and others followed suit, building datacenters that are close enough to New York City to be extremely useful but also outside of the Big Apple and therefore safer from natural and unnatural disasters.
In 1994, as this HFT boom was quietly happening near the swamps of New Jersey, ZT Systems was manufacturing PCs and SMB servers and had a pretty good business. But in 2004, the company refocused on providing the kind of high performance servers needed by HFT firms and other fintech customers in financial services. In 2010, ZT Systems started delivering rackscale infrastructure, and in 2013 it landed its first hyperscaler and cloud builder customers. These days, ZT Systems has manufacturing facilities in Secaucus, in Georgetown, Texas (outside of Austin), and in Almelo, the Netherlands (east of Amsterdam), and ships hundreds of thousands of servers a year and generates $10 billion in revenue. Yes, that is a lot of GPU servers, isn’t it?
This is probably the biggest server maker you have never heard of, and while the company used to have a lot of fintech customers and still sells to them, the vast majority of its revenue – and we mean all but maybe 1 percent of it – comes from about a dozen hyperscaler and cloud builder relationships that ZT Systems has built up over the years.
Frank Zhang, founder and chief executive officer at ZT Systems, will continue to run the manufacturing business and honor the commitments the company has to existing customers after AMD completes the acquisition of the company, which it expects to close early next year. In the meantime, Zhang will be looking for someone to buy the manufacturing operations, which have around 1,500 employees, because AMD is not interested in being in the server manufacturing and server selling businesses and therefore competing with its customers. Unlike another prominent GPU system maker we know. . . .
Besides, AMD already had that experience with microserver innovator SeaMicro, which it acquired in March 2012 for $334 million under chief executive officer Rory Read (remember him?), just as Lisa Su came in from Freescale Semiconductor to head up its global business units. AMD shut SeaMicro down in April 2015 as it hit reset on the server business after Su became president and CEO.
“Obviously, we started talking to all our OEM and ODM partners already,” Forest Norrod, general manager of the datacenter business at AMD and formerly the head of the custom server business at Dell, tells The Next Platform. “And I think the reassuring thing is that all those conversations have all been very good, very positive. People immediately get why we are doing this, and they appreciate and understand and believe that we have no intention of competing with them. Not going to do it, not going to happen. I understand super well both businesses, and I’m not confused.”
What AMD is going to do is up its game in systems architecture and engineering. At the moment, AMD has somewhere around 500 system engineers, Norrod estimates, but ZT Systems has 1,100 people who do this work. And given that AMD doesn’t just build systems to one standard, but multiple standards, it needs that many more people to help it design and build to test – but not manufacture for production – the future GPU accelerated systems that are going to be a challenge for it. It is not clear what AMD will get as it spins off that ZT Systems manufacturing business, but amassing 1,100 system engineers with deep, real-world experience would not only cost billions and billions of dollars, it may not be possible any other way than by acquiring a boutique high performance system manufacturer like ZT Systems.
It’s cheaper than acquiring Supermicro. . . . and probably with the same level of system engineers.
Norrod laid out the situation for us, and we are giving you the full quote so you understand the case that AMD is making for spending $4.9 billion for ZT Systems, which works out to $4.45 million per system engineer. (Some of that will be offset by the spinout of the manufacturing operations, of course.) Here is how Norrod put it:
“We have been looking forward at the roadmap and understanding the complexities of designing competitive and leadership performance and efficiency into systems. With AI systems it is increasingly clear to everyone in the industry that this is going to drive tremendous challenges in designing systems at these power levels, at these signaling rates, at this level of complexity. It is going to be difficult to keep them up and running, and manageable.”
“There are a whole set of issues that need to be resolved and the requirements to meet those are driven way back up, early into the silicon development process. We are familiar with some of these issues because they are issues you run into as you create supercomputers. But when you look at what’s going on with AI systems, the complexity is soaring so fast that it is absolutely critical that we have an adequate number of world-class system design engineers helping us from the very beginning of the silicon definition process. And so it became clear that we needed to substantially up our game here.”
“What makes it even more complicated is, as we up our game, we want to hold true to AMD’s heritage of open ecosystems and customer choice – not locking things down in proprietary fashion. And so that means that we need even more folks. Because if you’re trying to do one proprietary system and have everybody in the world use it and do anything they want with it as long as they keep it exactly the same, you need a certain number of folks. If you want to develop open ecosystems and support choice and variation, that’s even more complicated in that it requires even more system engineers to maintain time to market and to maintain high quality.”
This is really about time to market and upping the system design and engineering game. AMD has done a great job putting together excellent CPUs and now GPUs, but it needs to put together a networking stack and system boards, and have it all fit in rackscale and cluster-wide system designs that are tested and validated at scale. This is why Nvidia created the DGX line, and AMD agrees that it needs to do this, but it is not going to build systems for customers and it is not going to be a prime contractor for HPC or AI clusters. That is something Intel tried to be and failed at pretty badly.
Doug Huang, who was a director of engineering in the Data Center Solutions division at Dell under Norrod, was hired to be vice president of platform engineering at ZT Systems in January 2013, when it pivoted to rackscale systems. Huang rose through the ranks to head up engineering and global manufacturing at ZT Systems and was named president of the company in January 2023. Huang is going to stay on at AMD and run the combined team of around 1,600 system designers and engineers.
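For readers who want to check the back-of-envelope math in this story, here is a minimal Python sketch that reproduces the $4.45 million per system engineer figure and the combined team size of around 1,600. The variable and function names are our own and purely illustrative, the 500 figure is Norrod's rough estimate, and the calculation ignores whatever AMD recovers from selling the manufacturing operations, which is not known.

```python
# Back-of-envelope check of figures quoted in the article.
# All names here are illustrative; inputs come straight from the text.

DEAL_PRICE_USD = 4.9e9        # announced purchase price for ZT Systems
ZT_SYSTEM_ENGINEERS = 1_100   # ZT Systems system engineers
AMD_SYSTEM_ENGINEERS = 500    # Norrod's rough estimate for AMD today


def cost_per_engineer(deal_price_usd: float, engineers: int) -> float:
    """Gross price paid per engineer, before any offset from the
    spinout of the manufacturing operations (amount unknown)."""
    return deal_price_usd / engineers


def combined_team(amd_engineers: int, zt_engineers: int) -> int:
    """Approximate size of the merged system engineering team."""
    return amd_engineers + zt_engineers


per_engineer = cost_per_engineer(DEAL_PRICE_USD, ZT_SYSTEM_ENGINEERS)
team = combined_team(AMD_SYSTEM_ENGINEERS, ZT_SYSTEM_ENGINEERS)

print(f"${per_engineer / 1e6:.2f} million per system engineer")
print(f"{team:,} system engineers combined")
```

Any proceeds from the manufacturing spinout would lower the effective per-engineer cost, so the first number is best read as an upper bound.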