If NSF Snoozes, Then TACC’s “Horizon” Supercomputer Loses

It is a tumultuous time for any agency in the US government or any company or organization that depends on the US government for a sizable portion of its funding or revenue. The National Science Foundation is front and center on proposed cuts in spending by the Trump administration, and those cuts, if they should come to pass, will undercut scientific research and the computing tools that make it possible.

We are obviously not interested in any cutting in HPC research and systems, and in fact have long advocated for an increase in spending that is commensurate with the vast sums that are being blown on AI systems, which have yet to prove their usefulness even if AI systems are getting increasingly capable and accurate. (See The Future We Simulate Is The One We Create, from January 2024, and its counterpoint, The Future Is The One We Generate, from January 2025.)

The National Science Foundation is the funding conduit for lots of research and around half of the spending in the United States for reasonably large capability-class (strong scaling for large workloads) and capacity-class (weaker scaling for many workloads) supercomputers. The other half comes from the Department of Energy, which had its two exascale machines before Trump took office and DOGE started revving its chainsaw. Given that the Texas Advanced Computing Center at the University of Texas runs the flagship supercomputer of the NSF, and that the future “Horizon” supercomputer will be the most powerful NSF machine ever built, we are not happy to hear that the budget for Horizon is under threat.

A report out of Science, a publication put out by the American Association for the Advancement of Science, says that a fight between Congress and the Trump administration over what constitutes “emergency spending” at the NSF could delay the construction of the Horizon machine and its installation at a Sabey Data Centers facility in Austin.

Before getting into the details, it is important to note that AAAS has the US government as its leading funder, but it also gets a bunch of money to disseminate information about science and technology from a slew of foundations created by the titans of industry over the past two centuries. This gives Science a certain degree of freedom to bite the hand that feeds it when actions warrant it.

According to the report, Congress gave the NSF $234 million in March as part of its Major Research Equipment and Facilities Construction (MREFC) budget, which in turn is part of a $9 billion budget for fiscal year 2025, which began last October. Most of that $234 million was apparently allocated to continue building the Horizon system, which is expected to launch sometime in early 2026. Congress designated $12.4 billion in spending across infrastructure projects as emergency spending within an overall $1.9 trillion Congressional budget package that was approved on March 15. President Trump was not pleased with all of that emergency spending, and put the kibosh on $2.9 billion of it, including the $234 million allocated for the NSF, of which $154 million was earmarked for the ongoing construction of the Horizon machine.

And now, Congress and the White House are arguing about what is an emergency and what is not, and whether President Trump even has the right to stop spending on something that Congress has approved and that he has signed.

Details on the future Horizon machine are a bit thin. Back in November, TACC put out a statement saying that Horizon would weigh in at 400 petaflops peak theoretical performance on 64-bit floating point math, and deliver about 10X the performance of the all-CPU “Frontera” system currently at TACC’s datacenter on the UT campus. For AI work, which uses lower precision math, Horizon will be around 100X as powerful as Frontera, which cost $60 million and which is rated at 38.75 petaflops peak performance at FP64 precision.
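As a rough sanity check on that 10X figure, the two peak FP64 ratings quoted above can simply be divided. This is only a sketch: the function name is ours, and it assumes the 10X claim refers to peak FP64 throughput, which TACC's statement as quoted here does not spell out.

```python
# Peak FP64 ratings as quoted in the article, in petaflops.
HORIZON_PEAK_PF = 400.0
FRONTERA_PEAK_PF = 38.75


def peak_ratio(new_pf: float, old_pf: float) -> float:
    """Return how many times larger new_pf is than old_pf."""
    if old_pf <= 0:
        raise ValueError("old_pf must be positive")
    return new_pf / old_pf


ratio = peak_ratio(HORIZON_PEAK_PF, FRONTERA_PEAK_PF)
print(f"Horizon vs. Frontera, peak FP64: {ratio:.1f}X")
```

On these numbers the ratio lands a little above 10, which is consistent with the "about 10X" wording.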

Back in September, TACC announced it was working with Nvidia and Dell to create a stop-gap, interim machine between Frontera and Horizon called “Vista,” which is using a mix of compute nodes based on Nvidia’s “Grace” CG100 Arm CPUs and its “Hopper” GH100 GPUs. To be more precise, Vista has a mix of Grace-Grace and Grace-Hopper nodes, and the system was activated in September 2024. There are 600 nodes using Grace-Hopper superchips and 256 nodes using Grace-Grace superchips.

Vista is running at the TACC facility, not at the Sabey datacenter where Horizon will sit if the funding spigot is not turned off. The GPUs deliver 20.4 petaflops at FP64 precision on the Hopper vector cores and 40.8 petaflops on their tensor cores. The 512 CPUs in the Grace-Grace superchips deliver 1.8 petaflops at FP64, and the 600 Grace CPUs in the Grace-Hopper superchips run a bit faster and deliver 2.3 petaflops. Add it all up, counting the tensor core figure for the GPUs, and you have 44.9 peak aggregate petaflops of FP64 compute.
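For those who want to check the sums, here is a minimal Python sketch that adds up the Vista figures quoted above. The node counts and petaflops values come from the text; the names are ours, and the assumption that the 44.9 petaflops total counts the GPU tensor core number rather than the vector number is inferred from the arithmetic, not stated by TACC.

```python
# Vista peak FP64 figures as quoted in the article, in petaflops.
HOPPER_VECTOR_PF = 20.4     # 600 Hopper GPUs, vector cores
HOPPER_TENSOR_PF = 40.8     # 600 Hopper GPUs, tensor cores
GRACE_GRACE_PF = 1.8        # CPUs in the Grace-Grace nodes
GRACE_HOPPER_CPU_PF = 2.3   # Grace CPUs in the Grace-Hopper nodes

GRACE_GRACE_NODES = 256
GRACE_HOPPER_NODES = 600


def vista_peak_fp64(use_tensor_cores: bool = True) -> float:
    """Sum GPU and CPU peak FP64 petaflops, rounded to one decimal."""
    gpu_pf = HOPPER_TENSOR_PF if use_tensor_cores else HOPPER_VECTOR_PF
    return round(gpu_pf + GRACE_GRACE_PF + GRACE_HOPPER_CPU_PF, 1)


# Each Grace-Grace superchip carries two Grace CPUs.
grace_grace_cpus = GRACE_GRACE_NODES * 2

# Implied per-GPU vector rating, converted from petaflops to teraflops.
per_gpu_vector_tf = HOPPER_VECTOR_PF * 1000 / GRACE_HOPPER_NODES

print(f"Grace-Grace CPUs: {grace_grace_cpus}")
print(f"Per-GPU vector FP64: {per_gpu_vector_tf:.0f} teraflops")
print(f"Aggregate with tensor cores: {vista_peak_fp64(True)} petaflops")
print(f"Aggregate with vector cores: {vista_peak_fp64(False)} petaflops")
```

Counting only the vector cores instead, the same sum comes to 24.5 petaflops, so which GPU figure you pick nearly doubles the headline number.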

TACC has not said this, but there is every reason to believe – given the choice of Nvidia compute engines in the Vista system and the timing of the delivery of the machine next year – that Horizon will be a combination of the “Vera” follow-on to the Grace CPU, which will have 88 cores, and the “Rubin” GR100 GPU or one of its variations. Our guess is that the machine will be a mix of Vera-Rubin and Vera-Vera superchips, similar to the architecture for Vista.

The total budget for Horizon was set in July 2024 at $457 million, which includes operational and facilities funding as well as money for the machine itself. It is not clear how much of that is for the machine itself.

We hope that Congress and the Trump administration can work it out and keep Horizon on track for TACC. A lot of research is counting on it, and if there is a slip in the schedule, there is a good chance that TACC will lose its spot in the line of organizations waiting to be allocated Nvidia GPUs.
