You Are Paying The Clouds To Build Better AI Than They Will Rent You

Think of it as the ultimate offload model.

One of the geniuses of the cloud – perhaps the central genius – is that a big company that would have a large IT budget, perhaps on the order of hundreds of millions of dollars per year, and that has a certain amount of expertise creates a much, much larger IT organization with billions of dollars – and with AI now tens of billions of dollars – in investments and rents out the vast majority of that capacity to third parties, who essentially allow that original cloud builder to get their own IT operations for close to free.

The IT shop of that original company, be it the Amazon online retailer, the Microsoft software creation and distribution business, or the Google search engine and online advertising business, turns what was a cost center into a revenue stream and a profit pool.

Given this, the wonder is that Meta Platforms, one of the Super 8 hyperscalers, has never become a cloud provider.

All of the big cloud builders and a bunch of smaller ones specializing in AI training and inference have been investing absolutely huge fortunes in systems to support ever-growing AI workloads. And for the first time, we are beginning to see the secondary effects as cloud customers are starting to spend big bucks renting capacity on these clouds to test and implement GenAI for their workloads.

We like to look at the complete datacenter systems businesses for the top publicly traded IT suppliers anyway, and ultimately we don’t care about the distribution model as much as we do about revenues in the aggregate for acquisition of compute, storage, and networking capacity each quarter and related systems software (whether licensed or rented), technical support, and financing. We think that just looking at the cloud infrastructure is insufficient, especially when Microsoft has such a large systems business with its Windows Server platform. So each quarter, we try to peel it all apart and put it back together to give these IT suppliers their datacenter report cards.

Before getting into the financials for the systems businesses at Amazon, Microsoft, and Google, let’s take a look at enterprise spending in the third quarter for all cloud capacity, including spending on IaaS, PaaS, and hosted private cloud (what we call outposts after the term created by AWS). Microsoft recently reclassified a whole bunch of cloud services as SaaS that were originally categorized as IaaS, which has wreaked havoc with all of the models we and others build for cloud spending, but the reclassification did not change anything in the aggregate.

It does mean, however, that in forecasts that look at just IaaS and PaaS, Microsoft will have lower market share than it has had for the past three years. Synergy Research, for instance, had to rework its model (as did we for Microsoft’s “real” systems revenues, which include on-premises and cloudy systems spending by customers each quarter).

As you can see in the following chart, cloud spending is on the rise and growing faster than it has been for years:

Synergy reckons that in Q3 2024 spending on IaaS, PaaS, and hosted private cloud services was $83.8 billion worldwide, up 23 percent from the year ago period. This is an incremental $15.7 billion in revenues across the dozens of cloud players tracked by the company. And as you might expect, GenAI was the big driver for this growth in spending on cloudy infrastructure. (Exactly how much was spent on renting systems to support GenAI was not revealed by Synergy.)
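The incremental figure follows directly from the total and the growth rate. Here is a minimal sketch of that arithmetic in Python; the function name `year_ago_base` is ours and purely illustrative, and the only inputs are the two Synergy numbers quoted above.

```python
def year_ago_base(current: float, growth_pct: float) -> float:
    """Return the prior-period value implied by a current value and its growth rate."""
    return current / (1.0 + growth_pct / 100.0)


# Inputs taken from the Synergy figures quoted in the text ($ billions, percent).
q3_2024_spending = 83.8
growth_pct = 23.0

implied_q3_2023 = year_ago_base(q3_2024_spending, growth_pct)
incremental = q3_2024_spending - implied_q3_2023

print(f"Implied Q3 2023 spending: ${implied_q3_2023:.1f} billion")
print(f"Implied incremental spending: ${incremental:.1f} billion")
```

Because the growth rate is rounded to a whole percent, the implied year-ago base is only approximate, but the increment lands on the $15.7 billion cited.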

It is interesting to stare at that chart above. If you track cloud revenue growth down from its peak in early 2021 and follow the curve down as it went through late 2021 and all of 2022, it is not hard to see that cloud spending growth was waning for traditional workloads. The dirty little secret in the cloud business might be, in fact, that spending was already trending down to GDP, probably a multiple of GDP like 1.5X or 2X, which would mean 4 percent to 6 percent growth year on year. Maybe as high as 8 percent.

Now this was the trend, perhaps, with traditional machine learning and HPC helping to boost revenues at the clouds, which is happening in part because it is the hyperscalers and clouds who get the early and big allocations of GPUs. So HPC and AI in the cloud is a self-fulfilling prophecy caused by the way Nvidia and now AMD make the allocations for their devices.

This is great for Nvidia, obviously, and now for AMD and for their cloud partners, who absolutely want to corner the market on GenAI models while at the same time getting us all hepped up about the prospects for GenAI in our own applications so we will make investments in the cloudy GPUs that only they seem to be able to get their hands on. But the situation is even funnier. As we showed back in May, in How To Make More Money Renting A GPU Than Nvidia Makes Selling It, for every $1 that is spent buying a GPU at something close to list price, a cloud can extract $6.50 back out of renting that GPU for four years. Nvidia gets to make huge revenues and profits, so do the big clouds, and your excitement will ultimately give them back all the money they have blown on GenAI to beat you to artificial general intelligence – and then some.

That is a second order derivative of the offload model. They have a permanent advantage in scale as long as you believe in GenAI.

Freaking genius, isn’t it?
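To put the rental arithmetic cited above in concrete terms, here is a rough sketch in Python. The $6.50 multiple and the four-year term come from the text; the $30,000 purchase price is a purely hypothetical placeholder, not a quoted list price, and the function names are ours.

```python
RENT_MULTIPLE = 6.50  # rental dollars recovered per purchase dollar, from the text
RENTAL_YEARS = 4      # rental period the multiple is quoted over, from the text


def rental_revenue(purchase_price: float, multiple: float = RENT_MULTIPLE) -> float:
    """Gross rental revenue recovered over the full rental period."""
    return purchase_price * multiple


def revenue_per_year(purchase_price: float,
                     multiple: float = RENT_MULTIPLE,
                     years: int = RENTAL_YEARS) -> float:
    """Average gross rental revenue per year, spread evenly over the period."""
    return rental_revenue(purchase_price, multiple) / years


# Hypothetical purchase price, for illustration only.
example_price = 30_000.0

print(f"Four-year rental revenue: ${rental_revenue(example_price):,.0f}")
print(f"Average per year:         ${revenue_per_year(example_price):,.0f}")
```

This is gross revenue only. It ignores power, cooling, facilities, networking, and utilization, all of which eat into what the cloud actually keeps.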

This is what we want you to think about as we drill down in the financial results of the systems businesses of Amazon, Microsoft, and Google, who together represented 68 percent of worldwide cloud revenues in Q3 2024.

We think the incremental growth in the cloud business since 2021 is almost entirely attributable to AI. It is hard to prove that, but this is our hunch. This hullabaloo about GenAI is what has saved this business from maturing to a pace of only a 2X multiple of GDP growth. And the scarcity of GPUs and the hype around GenAI are what is driving this business.

Which means this: If Nvidia and AMD and their cloud partners didn’t have an HBM and CoWoS shortage, they would have had to invent one.

Eventually, those shortages will go away, we will enter the “trough of disillusionment” as Gartner calls it in the hype cycle, and GPU compute will be cheaper and ubiquitous. And we will then see revenues and profits normalize among chip makers and system builders and we will see just how useful this GenAI stuff really is.

In the meantime, let’s analyze Microsoft’s systems business, starting with a breakdown of its business by groups:

The restatement of some of its cloud revenues from IaaS and PaaS to SaaS had a dramatic impact on how it reports revenues in its groups, which you can see in the curves above starting with fiscal 2022.

The inevitability of Microsoft becoming the true rival to AWS is evident in this shift. Microsoft is the world’s largest supplier of commercial productivity applications, which knowledge workers use quite heavily and which billions of people use lightly every day, and it is the supplier of about half the datacenter software platforms in the world based on Windows Server. Given that, there was no question that Office 365 was going to be a huge factor, and that archiving and backing up SQL Server databases, moving Active Directory to the cloud, and various large-scale data services as well as cloudy versions of its Dynamics ERP suites were going to drive the company to build hyperscale infrastructure to rival AWS. And the money from its client and server platforms would fuel this buildout.

It has taken a decade for Microsoft to become a rival, and it will probably take another decade for Microsoft to reach the size of AWS at the raw infrastructure level, but we think it is similarly inevitable. The customer base is on Microsoft’s side. (And the wonder is why Microsoft didn’t buy Red Hat before IBM did so it could get a lock on the only growing commercial IT platform in the world. Perhaps it knows antitrust law well at this point.)

In the September quarter, which is the first quarter of Microsoft’s fiscal 2025 year, revenues rose by 22.8 percent to $65.59 billion, which was down 2.3 percent sequentially. Operating income was up 26 percent to $30.55 billion, which was an impressive 46.6 percent of revenues. Net income was up 10.7 percent to $24.67 billion.

Microsoft ended the quarter with $78.42 billion in cash and equivalents, much lower than usual in recent years, and part of the reason is that it spent $20 billion in capital expenses, mostly relating to Azure and mostly due to AI clusters, following fast on the heels of $19 billion in spending in Q4 F2024 and $14 billion in Q3 F2024. Since Q1 2023, when the GenAI boom started in earnest, Microsoft has spent a whopping $107.6 billion on infrastructure.

Chew on that for a second. Consider that cud before you re-swallow it. Let it ferment a little.

Now, let’s start peeling a bit. Microsoft Cloud, the broadest definition of cloud distribution of Microsoft products encompassing all IaaS+PaaS+SaaS, drove $38.92 billion in revenues in fiscal Q1, up 22.4 percent and comprising 59.3 percent of total revenues.

Of the three groups, the Intelligent Cloud group at Microsoft is the closest thing to a datacenter business, but it is too broad still to be considered a measure of the core systems business driven by Microsoft for both on-premises and cloudy infrastructure, in that Intelligent Cloud includes all of the middleware and database software and services, as well as other services, that Microsoft peddles into the datacenter. Anyway, Intelligent Cloud revenues were up 20.4 percent to $24.09 billion, with an operating income of $10.5 billion, up 17.9 percent and representing 43.6 percent of sales.
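The margin and share figures quoted so far for Microsoft can be re-derived from the dollar amounts. Here is a small sketch in Python that does so; the variable names and the `pct` helper are ours, and the inputs are the figures quoted above in billions of dollars.

```python
def pct(part: float, whole: float) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Figures quoted in the text for Q1 F2025 ($ billions).
total_revenue = 65.59
total_operating_income = 30.55
microsoft_cloud_revenue = 38.92
intelligent_cloud_revenue = 24.09
intelligent_cloud_operating_income = 10.5

operating_margin = pct(total_operating_income, total_revenue)
cloud_share = pct(microsoft_cloud_revenue, total_revenue)
ic_margin = pct(intelligent_cloud_operating_income, intelligent_cloud_revenue)

print(f"Operating margin:                {operating_margin:.1f} percent")
print(f"Microsoft Cloud share of sales:  {cloud_share:.1f} percent")
print(f"Intelligent Cloud margin:        {ic_margin:.1f} percent")
```

All three results round to the percentages given in the text.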

As part of the new categorizations announced in August and shown for the first time this quarter, Microsoft has a new division called Server Products and Cloud Services, which includes Azure services and sales of the Windows Server stack. In the September quarter, this tighter platform category accounted for $22.16 billion in sales, up 22.7 percent year on year.

We like to get a sense of the revenues Microsoft gets from sales of core infrastructure capacity and systems software, and here is how we model its “real” systems business, taking out the databases, other middleware, and development tools that run on platforms on premises or in the cloud:

We think this core systems business raked in $16.11 billion in revenues, up 33.6 percent, and had operating income of just a tad over $7 billion, up 30.9 percent, in Q1 F2025. That core systems business represents 24.6 percent of Microsoft’s total revenues but only 23 percent of its operating income. Microsoft makes a lot more margin on its Office stack and on Dynamics apps. As GenAI hardware is more broadly rented to enterprises, we expect the margins to rise dramatically for this core systems business. The fact that it has not done so in the past several years just shows you what a sweet deal OpenAI must have got for GPU clusters as part of its partnership with Microsoft.

Remember how much you are helping Microsoft and therefore OpenAI when you rent that GPU instance on Azure.

The GPU effect seems to already be boosting revenues and profits at AWS:

In the September quarter, the Amazon conglomerate had $158.88 billion in revenues, up 11 percent, with operating income of $17.41 billion, up 55.6 percent, and net income of $15.33 billion, up 55.2 percent. Amazon had $71.67 billion in cash, up hugely (43.2 percent) from a year ago, and spent $22.83 billion on capital expenses. We don’t know for sure how much of that was for distribution centers and their transportation systems and how much was for AWS datacenters, but our guess is that $18.3 billion of that was for infrastructure and $11 billion of that was for AI clusters alone. (Those are just hunches.)

Since the GenAI boom started, we think AWS might have spent around $72 billion on infrastructure, and maybe $30 billion to $32 billion of that is for AI clusters.

Now, if you pull the AWS part of the Amazon revenue streams out, the underlying Amazon retail, advertising, and media businesses brought in $131.43 billion in sales, up 9.5 percent, but operating income was cut by a factor of 10X to $406 million. For five quarters prior to this, the Amazon business outside of AWS was making profits ranging from 2.1 percent to 5 percent of revenues, and now it is back to being, well, a low margin retailer.

In Q3 2024, AWS revenue rose by 19.1 percent to $27.45 billion, and operating income was $10.45 billion, up 49.8 percent compared to the year ago period. You can see the GPU and possibly Trainium and Inferentia bounce in profits.

You can also see how the GenAI boom has helped get AWS growing again:

What we really want to know is how AWS revenues break down into compute, storage, networking, and software, and of course, the company does not provide such a useful breakdown in revenue categories.

We built a model many years ago based on the few slivers of data that AWS used to give out about compute and storage, and we have kept that model going based on hunches about what is going on in the market at large and what we think might be happening inside of AWS. We realize this is just a thought experiment and is not meant to be considered as “data” and is only being offered in the absence of real data, but it illustrates a point:

We think that storage and networking revenues for AWS have flatlined since the GenAI boom started and that compute revenues are now growing faster than SaaS service revenues, which comprise a little less than half of the revenues for AWS these days. In our model, compute revenues at AWS were up 58.7 percent to $7.69 billion; storage revenues were down 3 percent to $3.57 billion; and networking revenues were down 18.2 percent to $3.02 billion.

But again, this is made up data that is meant to encourage Amazon to start talking about the AWS business in this proper fashion.
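For those who want to play along with the thought experiment, here is a sketch in Python of how the software remainder falls out of the modeled figures. It makes two assumptions that are ours: total AWS revenue is taken as the Amazon total minus the non-AWS sales figure quoted earlier, and whatever is not compute, storage, or networking is counted as software and services.

```python
# Figures quoted in the text ($ billions).
amazon_total_revenue = 158.88
amazon_non_aws_revenue = 131.43

# Modeled (not reported) AWS categories from the text ($ billions).
modeled = {
    "compute": 7.69,
    "storage": 3.57,
    "networking": 3.02,
}

# Assumption: AWS revenue is the Amazon total less the non-AWS businesses.
aws_revenue = amazon_total_revenue - amazon_non_aws_revenue

# Assumption: anything not in the three infrastructure buckets is software/services.
infrastructure = sum(modeled.values())
software = aws_revenue - infrastructure
software_share = 100.0 * software / aws_revenue

print(f"AWS revenue (derived):       ${aws_revenue:.2f} billion")
print(f"Modeled infrastructure:      ${infrastructure:.2f} billion")
print(f"Implied software remainder:  ${software:.2f} billion")
print(f"Software share of AWS:       {software_share:.1f} percent")
```

The remainder comes out at a little less than half of AWS revenue, which matches the description of the model above.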

That brings us, finally, to Google. And no, we are not going to call it Alphabet, which is just a holding company dominated by Google.

In the September quarter, Google revenues rose by 13.6 percent to $88.27 billion. Operating income was up 25.6 percent to $28.52 billion, and net income rose a little faster, up 28.6 percent to $26.3 billion, representing 29.8 percent of sales.

Google ended the quarter with $93.23 billion in cash and equivalents, down 14.9 percent year on year. Google spent $13.1 billion on capital expenses in Q3 2024, up 62 percent compared to last year. Google said on a call with Wall Street analysts going over the numbers that it would spend about the same amount in Q4 2024, which would put its spending at $51.4 billion in 2024 and $32.3 billion in 2023. Since the GenAI boom started in Q1 2023 through Q3 2024, Google has spent $70.63 billion on capital expenses.
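The cumulative Google figure can be rebuilt from the annual numbers. Here is a sketch in Python under one assumption of ours: that the Q4 2024 estimate equals the Q3 spend, as the "about the same amount" guidance suggests. The inputs are the rounded figures from the text, so the result differs from the quoted total by a few hundredths of a billion.

```python
# Figures quoted in the text ($ billions).
capex_q3_2024 = 13.1
capex_full_2024 = 51.4   # includes the Q4 2024 estimate
capex_full_2023 = 32.3

# Assumption: Q4 2024 spending is about the same as Q3 2024.
capex_q4_2024_estimate = capex_q3_2024

# Strip the Q4 estimate back out to get actual spending through Q3 2024.
capex_2024_through_q3 = capex_full_2024 - capex_q4_2024_estimate

# Cumulative spending from Q1 2023 through Q3 2024.
cumulative = capex_full_2023 + capex_2024_through_q3

print(f"2024 through Q3:          ${capex_2024_through_q3:.1f} billion")
print(f"Q1 2023 through Q3 2024:  ${cumulative:.1f} billion")
```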

If you add up the Big Three, that is $250 billion in datacenter infrastructure spending since the GenAI boom started, with Microsoft outspending Google and AWS by a considerable margin. That’s because Microsoft has two big AI customers: itself and OpenAI.
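Here is the tally behind that number as a short Python sketch. The Microsoft and Google figures are the ones quoted above, and the AWS figure is the rough "around $72 billion" estimate, so treat the shares as approximate.

```python
# Cumulative spending since the GenAI boom started, from the text ($ billions).
# The AWS value is an estimate, not a reported number.
spending = {
    "Microsoft": 107.6,
    "AWS": 72.0,
    "Google": 70.63,
}

total = sum(spending.values())
shares = {name: 100.0 * value / total for name, value in spending.items()}
biggest = max(spending, key=spending.get)

print(f"Big Three total: ${total:.1f} billion")
for name, share in shares.items():
    print(f"  {name}: {share:.1f} percent")
print(f"Largest spender: {biggest}")
```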

In Q3, Google Cloud had $11.35 billion in revenues, up 28.8 percent, and operating income of a mere $1.95 billion, up nearly 3X over the year ago period. Google is getting better and better at growing and running its cloud business. We have yet to build a model that extracts the SaaS portions of Google from the IaaS and PaaS and hosted private cloud portions.

A funny thought as we consider the Google Cloud business. If Google wanted to grow its cloud market share, all it would have to do is have Google Cloud buy its entire fleet of datacenters and then sell access to that as a service to the other parts of the Alphabet empire. Borg and Omega and Spanner as a service. Then its costs for those search engine, ad serving, and video serving revenue streams for Alphabet turn into revenues for Google Cloud. (Intel is doing much the same with its compute engine and foundry businesses.) Such a move would make the systems business at Google Cloud immediately look to be at the same scale as AWS and Microsoft, we think.

Instead, the relative system business sizes of Microsoft, AWS, and Google look like this:

It’s weird, but then again not, that Microsoft and AWS grow in lockstep. They have very different businesses, with Microsoft still having a very large on-premises systems software business.

Anyway, the relative operating income of these companies looks like this:

It’s something to think about, Sundar Pichai and new chief financial officer, Anat Ashkenazi, who is replacing Ruth Porat as she becomes Alphabet president and chief investment officer.

Eric Olson says:

November 5, 2024 at 1:27 pm

In the past CompuServe sold time on a fleet of 36-bit DECSYSTEM-20s with profits similar to the rentable GPU cluster providers today. Then CPU hardware became plentiful and cheap.

As long as geopolitical forces don’t create even more scarcity, it seems likely that GPU supply will catch up to demand. The difference, as already pointed out, is that Amazon, Google and Microsoft need the same infrastructure for their primary business. Therefore, bankruptcy may not happen after GPU clouds become less profitable.

Donna Hartd says:

November 5, 2024 at 11:56 pm

Off topic: What in the world is going on at SuperMicro?? Looks like they’re basically asking for NVidia to cut their Blackwell allocations, I’m sure Big Green would like to have assurance that they’ll get paid for their goods.

- Timothy Prickett Morgan says:
  
  November 6, 2024 at 8:52 am
  
  I have no idea, but when they actually release their financials and forecasts beyond the preliminaries that came out yesterday, which tell us very little, I will dig into it.
  
Slim Albert says:

November 6, 2024 at 11:38 am

Great analysis and update (eg. from last Dec. 18)! And with “$250 billion in datacenter infrastructure spending since the GenAI boom started”, just from the big 3, It’s no wonder Lisa Su had to update her TAM slides last month ( https://www.nextplatform.com/2024/10/18/the-world-will-eat-2-trillion-in-ai-servers-ai-will-eat-the-world-right-back/ ). Add to that the (non-cloud-provider) 100,000 to 200,000 GPUs in xAI’s Colossus, and Meta’s seeking to have more than a 1/2 million H100-equivalent GPUs before 2025, including MI3xx (among others), and wow … it’s a definite boom!

(PS.: should the x-axis on the last two plots extend to 2024?)

- Timothy Prickett Morgan says:
  
  November 6, 2024 at 9:02 pm
  
  Yes. I will fix. Thanks for the catch. I thought I had done that.
