Energy Is All You Need

Last updated June 18, 2024 View on GitHub Commit cd4bad3

Energy is all you need

There are two crucial currencies for the future:

FLOPS (computational power)
Watts (energy efficiency)

We need more intelligence and more energy.

Computation power

Training OpenAI’s GPT-4 required an estimated 21 billion petaFLOP (petaFLOP is 10¹⁵ floating point operations).
iPhone 12 is capable roughly 0.01 petaFLOP. It would take 60,000 years to train GPT-4 on iPhone 12.
NVIDIA GB200 NVL72 is capable of 1440 petaFLOP in single rack.

Training compute of frontier AI models grows by 4-5x per year

Notable models include:

AlexNet (c. 2012, ~10¹⁸ FLOP)
AlphaGo Master (c. 2016, ~10²² FLOP)
AlphaGo Zero (c. 2017, ~10²³)
GPT-3 (c. 2020, just below 10²⁴ FLOP)
PaLM (c. 2022, ~10²⁴ FLOP)
GPT-4 (c. 2023, above 10²⁴ FLOP)
Gemini Ultra (c. 2023, near the top right, ~10²⁵ FLOP)

The overall trend shows a massive and consistent increase in the computational power required to train frontier AI models over the last decade.

Key needs of training and infrence

AI training workloads have unique requirements that are very dissimilar to those of typical hardware deployed in existing data centers.
First, models train for weeks or months, with network connectivity requirements being relativity limited to training data ingress. Training is latency insensitive and does not need to be near any major population centers.
AI training clusters can be deployed essentially anywhere in the world that makes economic sense, subject to data residency and compliance regulations.
The second major difference to keep in mind is – AI training workloads are extremely power hungry and tend to run AI hardware at power levels closer to their Thermal Design Power (TDP) than would a traditional non-accelerated hyperscale or enterprise workload.
Additionally, while CPU and storage servers consume on the order of 1kW, each AI server is now eclipsing 10kW. Coupled with the insensitivity towards latency and decreased importance of proximity to population centers, this means that the availability of abundant quantities of inexpensive electricity (and in the future - access to any grid supply at all) is of much higher relative importance for AI training workloads vs traditional workloads.
Inference on the other hand is eventually a larger workload than training, but it can also be quite distributed. The chips don’t need to be centrally located, but the sheer volume will be outstanding.

AI infrastructure demand

Estimates of power

The IEA’s recent Electricity 2024 report suggests 90 terawatt-hours (TWh) of power demand from AI data centers by 2026, which is equivalent to about 10 Gigawatts (GW) of data center Critical IT Power Capacity, or the equivalent of 7.3M H100s.
It’s estimate that Nvidia alone will have shipped accelerators with the power needs of 5M+ H100s (mostly shipments of H100s, in fact) from 2021 through the end of 2024.
Datacenter power capacity growth will accelerate from a 12-15% CAGR to a 25% CAGR over the next few years.
Power demand will surge from 49 Gigawatts (GW) in 2023 to 96 GW by 2026, of which AI will consume ~40 GW.
The need for abundant, inexpensive power, and to quickly add electrical grid capacity while still meeting hyperscalers’ carbon emissions commitments, coupled with chip export restrictions, will limit the regions and countries that can meet the surge in demand from AI data centers. This will also push carbon emission commitments as less important.

Global data center critical IT power (Megawatts - MW)

Year	Non-AI Data Center Critical IT Power (MW)	AI Data Center Critical IT Power (MW)	Total Critical IT Power (MW)
2022	40,000	3,000	43,000
2023	42,000	7,000	49,000
2024	45,000	14,000	59,000
2025	50,000	20,000	70,000
2026	56,000	40,000	96,000
2027	60,000	60,000	120,000
2028	62,000	85,000	147,000

Estimated hyperscaler data center capacity (MW)

Company	2022 Capacity (MW)	Projected future capacity (MW)	Total (MW)
Google	3024	2905	5929
Microsoft	2176	3344	5520
Amazon	2480	2533	5013
Meta	1790	2595	4385
Apple	600	1403	2003
Alibaba	1350	487	1837
Huawei	494	192	686
Baidu	608	36	644
Tencent	487	152	639

Estimated data center construction by region (MW)

Region	2023 Inventory (MW)	Under construction (MW)	Total (MW)
Northern Virginia	2499	1237	3736
Atlanta	310	733	1043
Dallas-Ft. Worth	565	287	852
Chicago	560	118	678
Hillsboro	262	281	543
Silicon Valley	428	113	541
Phoenix	360	164	524
New York Tri-state	190	126	316

Current overview of huperscalers

OpenAI has plans to deploy hundreds of thousands of GPUs in their largest multi-site training cluster, which requires hundreds of megawatts of Critical IT Power.
Meta discusses an installed base of 650,000 H100 equivalent by the end of the year. They have currently 1M H100 in order.
GPU Cloud provider CoreWeave has big plans to invest $1.6B in a Plano, Texas facility, implying plans to spend for construction up to 50MW of Critical IT Power and install 30,000-40,000 GPUs in that facility alone with a clear pathway to a whole company 250MW datacenter footprint (equivalent to 180k H100s).
Microsoft has the largest pipeline of datacenter buildouts pre-AI era. They have been grabing any and all colocation space they can as well aggressively increasing their datacenter buildouts.
Amazon have made press releases about nuclear powered datacenters totaling 1,000MW. They were the last of the hyperscale’s to wake up to AI.
Google, and Microsoft/OpenAI both have plans for larger than gigawatt class training clusters in the works.
From a supply perspective, sell side consensus estimates of 3M+ GPUs shipped by Nvidia in calendar year 2024 would correspond to over 4,200 MW of datacenter needs, nearly 10% of current global data center capacity, just for one year’s GPU shipments.
AI is only going to grow in subsequent years, and Nvidia’s GPUs are slated to get even more power hungry, with 1,000W, 1,200W, and 1,500W GPUs on the roadmap.
Nvidia is not the only company producing accelerators, with Google ramping custom inference accelerator production rapidly.
Going forward, Meta and Amazon will also ramp their in-house inference accelerators.

NVIDIA systems

DGX H100 SYSTEM

Specification	Details
GPUs	8x NVIDIA H100 Tensor Core GPUs
GPU Memory	640GB total
Performance	32 petaFLOPS
System Power Usage	~11.3kW max
Rack Units	8U
CPU	Dual 56-core 4th Gen Intel® Xeon® Scalable processors

DGX B200 SYSTEM

Specification	Details
GPUs	8x NVIDIA Blackwell GPUs
GPU Memory	1,440GB total GPU memory
Performance	72 petaFLOPS training and 144 petaFLOPS inference
System Power Usage	~14.3kW max
Rack Units	10U
CPU	2 Intel® Xeon® Platinum 8570 Processors 112 Cores total, 2.1 GHz (Base), 4 GHz (Max Boost)

NVIDIA GB200 NVL72

Specification	Details
Configuration	36 Grace CPU : 72 Blackwell GPUs
GPU Memory	1,440GB total GPU memory
Performance	1,440 petaFLOPS
System Power Usage	~120.0kW max
Rack Units	Full rack
CPU	2,592 Arm® Neoverse V2 cores

Data centers

Global data centers market

74 Countries

1004 Cities

3362 Data Centers

304 Providers

Top 10 Colocation Providers

Provider	Locations
China Telecom	382 locations
Equinix	246 locations
Digital Realty	231 locations
Amazon AWS	165 locations
Lumen	105 locations
Zenlayer	97 locations
MOD Mission Critical	94 locations
DataBank	69 locations
Cogent Communications	51 locations
Hivelocity	45 locations

North America data center trends H2 2023

In 2023, primary market supply grew 26% year-over-year to 5,174.1 MW.
An all-time high of 3,077.8 MW was under construction in primary markets, a 46% year-over-year increase. Construction increased most in Atlanta, growing by 211% to 732.6 MW under construction.
Northern Virginia had a 42% year-over-year price increase, the largest among primary markets.
Preleasing activity in primary markets is strengthening, with 2,553.1 MW (83%) of the 3,077.8 MW under construction preleased.
The overall vacancy rate for primary markets remains near a record low, at 3.7%. With few relocation options, most tenants are renewing existing leases rather than seeking new facilities.
Power availability continued to influence data center operators’ location decisions more than geography did.

H2 2023 wholesale primary market fundamentals

| Market | Inventory (MW) | Y-o-Y Change (MW) | Available MW/Vacancy Rate | 2023 Net Absorption (MW) | Rental Rates (kW/mo)** | | :— | :— | :— | :— | :— | :— | | Northern Virginia | 2,499.1 | ▲ 439.0 | 34.7 / 1.4% | 424.4 | $150-$190 | | Dallas-Ft. Worth | 565.3 | ▲ 173.1 | 41.6 / 7.4% | 155.2 | $135-$170 | | Chicago | 559.6 | ▲ 217.4 | 11.7 / 2.1% | 226.8 | $145-$155 | | Silicon Valley | 427.7 | ▲ 48.1 | 31.0 / 7.3% | 25.7 | $155-$250 | | Phoenix | 360.0 | ▲ 35.5 | 14.2 / 3.9% | 48.8 | $170-$200 | | Atlanta | 310.0 | ▲ 57.5 | 41.1 / 13.3% | 18.0 | $120-$130 | | Hillsboro | 262.4 | ▲ 94.0 | 6.5 / 2.5% | 93.3 | $125-$170 | | New York Tri-State| 190.0 | ▲ 12.5 | 12.3 / 6.5% | 14.1 | $170-$180 |

H2 2023 wholesale secondary market fundamentals

| Market | Inventory (MW) | Available MW/Vacancy Rate | 2023 Net Absorption (MW) | Rental Rates (kW/mo)** | | :— | :— | :— | :— | :— | | Central Washington | 186.4 | 0.4 / 0.2% | 20.9 | $135-$175 | | Austin/San Antonio | 162.2 | 2.8 / 1.8% | 6.7 | $140-$175 | | Southern California| 160.5 | 34.6 / 21.6% | 6.8 | $135-$160 | | Seattle | 138.9 | 10.0 / 7.2% | 10.6 | $135-$175 | | Houston | 134.1 | 26.5 / 19.7% | 19.2 | $140-$175 | | Denver | 92.9 | 17.2 / 18.5% | 8.2 | $135-$145 | | Minneapolis | 59.6 | 14.7 / 24.7% | -3.2 | $120-$175 | | Charlotte/Raleigh | 52.1 | 13.6 / 26.0% | -1.2 | $115-$130 |

Average asking rental rate with Y-o-Y % change for primary markets

2014 to 2017: A gradual decline from ~$145 to ~$130.
2017 to 2022: Rates hovered between ~$125 and ~$130.
2022 to 2023: A significant jump in price, with a Y-o-Y change of +18.6%. The average rate rose to over $150/kW/month.

Data center layout and constraints

While the DGX H100 server requires 10.2 kilowatts (kW) of IT Power, most colocation data centers can still only support a power capacity of ~12 kW per rack. Typical Hyperscale datacenter can deliver higher power capacity.
Server deployments will therefore vary depending on the power supply and cooling capacity available, with only 2-3 DGX H100 servers deployed where power/cooling constrained, and entire rows rack space sitting empty to double the power delivery density from 12 kW to 24 kW in colocation data centers.
As data centers are increasingly designed with AI workloads in mind, racks will be able to achieve power densities of 30-40kW+ using air cooling by using specialized equipment to increase airflow.
The future use of direct to chip liquid cooling opens the door to even higher power density by potentially reducing per rack power usage by 10% by eliminating the use of fan power, and lowering PUE by 0.2-0.3 by reducing or eliminating the need for ambient air cooling.

Density

The trend towards higher power density per rack is driven by networking, compute efficiency and cost per compute. Roughly 90% of colocation data center costs are from power and 10% is from physical space.
The data hall where IT equipment is installed is typically only about 30-40% of a data center’s total gross floor area, so designing a data hall that is 30% larger will only require 10% more gross floor area for the entire data center.
Considering that 80% of the GPU cost of ownership is from capital costs, with 20% related to hosting (which bakes in the colocation data center costs) the cost of additional space is 2-3% of total cost of ownership for an AI cluster.
Most existing colocation data centers are not ready for rack densities above 20kW per rack.
Chip production constraints will meaningfully improve in 2024, but certain hyperscale’s and colocation data centers run straight into a data center capacity bottleneck.
Limits of 12-15kW power in traditional colocation will be an obstacle to achieving ideal physical density of AI super clusters.

Data center constrains list

Some of the constrains are mentioned before, but there is more to consider:

Energy (3-5+ years to build capacity)
Energy density per rack (designing and cabling for 1MW+ of energy per 42U size rack)
Capabilities to build new data centers and execution (18-24 months for delivery)
Grid network connection and power distribution (transformers, switchgears, substation, transmission lines)
Thermal dissipation/cooling of server racks (challenge of cooling over 120kW and not long into the future 1MW per rack which opens two-phase cooling)
Networking (proximity between GPUs and positioning of racks, close is better and easier but not always possible)
Load/weight per square meter/square foot (static weight and point load)

Current pain points

Data center cunstructoin delays

Delay Duration	Percentage of Respondents
3-6 months	38.80%
6-9 months	16.30%
9-12 months	15.50%
12-18 months	14.70%
0-3 months	9.30%
18+ months	5.40%

Insufficient power planning

Factor	Importance
Availability of secure grid power	16.40%
Cost of power	13.30%
Access to renewable energy	11.50%
Availability of workforce and skills	11.20%
Capital expenditure	10.40%
Network proximity	10.20%
Tax breaks/financial incentives	8.10%
Favorable regulation	6.80%
Proximity to client populations	4.70%
Community acceptance	4.20%
Availability of water	3.40%

Data center math

Simple calculation based on 20,480 GPUs (H100)

1x H100 GPU = ~$30,000

20,480 x H100 = ~$614.40M capex for GPUs
1,389W per GPU -> 20,480 GPUs = 28.4 MW power

1MW can power ~ 1,000 homes

28.4 MW can power ~28,400 homes
28.4 MW * 0.083 USD/kWh = ~$20.70M electricity per year

Applying same formula, 100k cluster consumes ~140-150MW and will cost ~$100-130M in electricity per year.

Data center power usage in the US

Data center Critical IT Capacity in the US will need to triple from 2023 to 2027.

Metric	Units	2020	2021	2022	2023	2024	2025	2026	2027	2028
AI Data Center Critical IT Power	MW	318	640	1,102	3,332	8,499	16,356	28,140	41,337	56,280
Non-AI Data Center Critical IT Power	MW	14,231	16,395	18,376	19,221	19,798	21,382	23,520	25,637	27,175
Critical IT Power	MW	14,550	17,035	19,478	22,553	28,297	37,738	51,660	66,974	83,455
Utilization Rate	%	65%	66%	66%	67%	70%	72%	73%	74%	75%
Critical IT Power Consumed	MW	9,505	11,169	12,826	15,159	19,668	26,983	37,800	49,733	62,688
Power Usage Effectiveness (PUE)	Ratio	1.59	1.56	1.53	1.47	1.4	1.34	1.3	1.26	1.22
Data Center Utility Power Consumed	MW	15,142	17,407	19,660	22,323	27,538	36,263	48,957	62,521	76,684
Data Center Actual Power Usage, per year	TWh	133	152	172	196	241	318	429	548	672
As % of United States Power Generation	%	3.30%	3.70%	4.00%	4.50%	5.50%	7.10%	9.50%	12.00%	14.60%

Infrastructure and energy

US, East Asia, Western Europe and Middle East - overview

The energy supply situation in the US stands in stark contrast to East Asia and Western Europe, which host about 15% and 18% of global data center capacity, respectively. While the US is self-sufficient in natural gas, countries such as Japan, Taiwan, Singapore, and Korea import well over 90% of their gas and coal needs.
In Western Europe, electricity generation has been slowly declining, with a 5% drop cumulatively over the past five years. One reason for the drop is that nuclear power has become a political non-starter, causing nuclear power generation to decline massively.
A strong focus on the “environment” has led to dirty fuel sources such as coal also declining dramatically over the same time, although the cleanest power in the world nuclear has been replaced with coal and natural gas in some instances.
Renewable energy is increasing within Europe’s power mix, but not fast enough, leaving many Europeans countries to scramble to pivot more towards natural gas, which now stands at 35-45% of the power generation mix for major Western European countries.
Europe is slowed building with many regulations and restrictions on the data center and manufacturing industries already in place. While small projects and pipelines for data centers are in progress, especially in France who at least has somewhat realized the geopolitical necessity, no one is planning to build Gigawatt class clusters in Europe.
Europe has less than 4% of globally deployed AI Accelerator FLOPs based on estimates.
Middle East is opening as a new strategic hub in geopolitical relationship with US. ASIC companies like Groq and Cerebras are having tight relationships with Middle East and most revenue coming from that region.

Electricity generation - selected regions

China: Shows dramatic and steep growth, starting below the US in 2000 and rising sharply to become the world’s largest electricity generator, exceeding 8,000 TWh by 2023.
United States: Generation has remained relatively flat, hovering around 4,000 TWh for the entire period.
European Union (27): Shows a slight decline over the period, starting above 2,500 TWh and ending slightly below it.
Japan, South Korea, France, Germany, United Kingdom, Taiwan: All show relatively flat or slightly declining generation, clustered much lower on the chart (all below 1,000 TWh).

Building out AI infrastructure at scale

The AI data center industry is going to need the following:

Inexpensive electricity costs given the immense amount of power to be consumed on an ongoing basis, particularly since inference needs will only compound over time.
Stability and robustness of energy supply chain against geopolitical and weather disturbances to decrease likelihood of energy price volatility, as well as the ability to quickly ramp up fuel production and thus rapidly provision power generation at great scale.
Power generation with a low carbon intensity power mix overall, and that suitable to stand up massive quantities of renewable energy that can produce at reasonable economics.

Countries that can provide this are contenders to be Real AI Superpowers.

This underscores a strong case for nuclear energy as a future source, capable of providing hundreds of megawatts to over gigawatt of electrical energy.

No energy source surpasses nuclear in terms of energy density, stability, predictability over extended periods, and safety.

Conclusion

Europe has less than 4% of global AI Accelerator FLOPs and is falling behind. Politics limit energy to support new builds or parts to be an AI superpower.
Most people knew about chip or GPU shortages over the last few years. That segment is slowly recovering and may recover first. Chip production is mainly done in Taiwan by TSMC, but is resolving by building chip fabs on US ground.
The biggest problem is energy, then energy density. The US and Europe had little growth in energy production over the last decade. Getting new energy sources and increasing production is hard. The energy grid is the next problem since it wasn’t designed for those loads and lacks investment to upgrade (on average 40-50 years old).
After solving the issue of delivering over 1MW per rack in data centers, power is needed to cool equipment with two-phase cooling. Gigawatt data centers will be a huge challenge. Swings in power can blow up grid and will need stabilization.
Networking in data centers is the next step. If GPU servers are far apart, more fiber/cabling is needed, increasing costs. The short reach enables lower cost multimode optical transceivers as opposed to expensive single mode.
Parts of the problems can be solved using nuclear energy and building data centers near nuclear power plants. Solar and BESS (Battery Energy Storage System) are much faster deployed and more scalable in current environment.
Most currently built data centers are powered with natural gas and gas turbine supply chain can’t follow demand.
We have not resolved any part of the supply chain to meet AI demand for the next decade, with energy being the most challenging issue.