Venture Capitalist at Theory


3 minute read / May 23, 2024 /


NVIDIA’s growth is an index on the growth of AI. “Compute revenue grew more than 5x and networking revenue more than 3x from last year.”

Data center revenue totaled $26b, with about 45% from the major clouds (roughly $12b). These clouds announced they were spending $40b in capex to build out data centers, implying NVIDIA is capturing very roughly 30% of the total capex budgets of its cloud customers.
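The capex-capture estimate above is simple arithmetic. A quick sketch, using the figures from the post ($26b data center revenue, mid-40s% from the major clouds, ~$40b of announced cloud capex):

```python
# Back-of-the-envelope: what share of announced cloud capex
# is flowing to NVIDIA? All inputs are the post's own figures.
data_center_revenue_b = 26.0
cloud_share = 0.45            # "mid-40s as a percentage of our Data Center revenue"
cloud_capex_b = 40.0          # announced cloud capex

cloud_revenue_b = data_center_revenue_b * cloud_share
capex_capture = cloud_revenue_b / cloud_capex_b

print(f"Cloud-driven NVIDIA revenue: ${cloud_revenue_b:.1f}b")  # ~$11.7b
print(f"Share of cloud capex: {capex_capture:.0%}")             # ~29%
```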

“Large cloud providers continue to drive strong growth as they deploy and ramp NVIDIA AI infrastructure at scale and represented the mid-40s as a percentage of our Data Center revenue.”

NVIDIA has started to highlight the return-on-investment (ROI) for cloud providers. As GPU prices increase, so do NVIDIA's profits, to a staggering degree: nearly 10x in dollar terms in two years. Is this a problem for the clouds?

Fiscal Year    Profits, $b    Net Income Margin
2020           2.8            26%
2021           4.3            36%
2022           9.7            42%
2023           4.4            26%
2024           29.8           57%
LTM            42.6           62%

That may not matter to GPU buyers - at least not yet - because of the unit economics. Today, $1 spent on GPUs produces $5 of revenue.

“For every $1 spent on NVIDIA AI infrastructure, cloud providers have an opportunity to earn $5 in GPU instance hosting revenue over 4 years.”

But soon it will generate $7 of revenue per dollar spent. Amazon Web Services operates at a 38% operating margin. If these numbers hold, newer chips should improve cloud GPU profits, assuming the efficiency gains are not competed away.

“H200 nearly doubles the inference performance of H100, delivering significant value for production deployments. For example, using Llama 3 with 700 billion parameters, a single NVIDIA HGX H200 server can deliver 24,000 tokens per second, supporting more than 2,400 users at the same time. That means for every $1 spent on NVIDIA HGX H200 servers at current prices per token, an API provider serving Llama 3 tokens can generate $7 in revenue over 4 years.”

And this trend should continue with the next generation architecture, Blackwell.

“The Blackwell GPU architecture delivers up to 4x faster training and 30x faster inference than the H100”

We can also guesstimate the value of some of these customers. DGX H100s cost about $400-450k as of this writing. With 8 GPUs per DGX, that means Tesla acquired about $1.75b worth of NVIDIA hardware, assuming they bought, not rented, the machines.

“We supported Tesla’s expansion of their training AI cluster to 35,000 H100 GPUs”

In a parallel hypothetical, Meta would have spent $1.2b on the hardware that trained Llama 3. But the company plans to buy 350,000 H100s by the end of 2024, implying about $20b of hardware purchases.

“Meta’s announcement of Llama 3, their latest large language model, which was trained on a cluster of 24,000 H100 GPUs.”
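The cluster-value estimates above follow from one formula: GPUs ÷ 8 per DGX system × system price. A sketch under the post's assumptions (DGX H100 at $400-450k; buyers own the hardware outright):

```python
# Guesstimated hardware cost of an H100 cluster, in billions of dollars.
def cluster_cost_b(gpus: int, dgx_price: float = 400_000) -> float:
    systems = gpus / 8                  # 8 GPUs per DGX H100 system
    return systems * dgx_price / 1e9    # total cost in $b

tesla = cluster_cost_b(35_000)                          # ~$1.75b
llama3_training = cluster_cost_b(24_000)                # ~$1.2b
meta_2024 = cluster_cost_b(350_000, dgx_price=450_000)  # ~$19.7b, i.e. ~$20b
```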

As these costs skyrocket, it wouldn’t be surprising for governments to subsidize these systems just as they have subsidized other kinds of advanced technology, like fusion or quantum computing, or to fund them as part of national defense.

“Nations are building up domestic computing capacity through various models.”

There are two workloads in AI: training the models & running queries against them (inference). Today, training is 60% of NVIDIA’s data center revenue and inference is 40%. One intuition is that inference should become the vast majority of the market over time as model performance asymptotes.

However, it’s unclear whether that will happen, primarily because of the massive increase in training costs. Anthropic’s CEO has said models could cost $100b to train within 2 years.

“In our trailing 4 quarters, we estimate that inference drove about 40% of our Data Center revenue.”

The trend shows no sign of abating. Neither do the profits!

“Demand for H200 and Blackwell is well ahead of supply, and we expect demand may exceed supply well into next year.”
