Training Built the US. Inference Will Build Everything Else.

In the six months between August 2025 and February 2026, analyst consensus for data centre capital expenditure in fiscal year 2027 was revised upward by nearly $300 billion – a 57% increase – to over $800 billion. BloombergNEF now projects cumulative data centre capex of $3.3 trillion through 2029. These are not speculative figures. They reflect committed capital from the twenty largest publicly listed data centre operators, led by Amazon, Google, Meta, and Microsoft, which together account for 80% of the total. The scale of infrastructure being built is without precedent. The question that matters for allocators is not whether this capital gets deployed. It is where it goes – and what kind of compute it serves.
For the past three years, the answer was straightforward: training. Massive centralised clusters of GPUs, enormous single-site power draws, and a land grab concentrated overwhelmingly in the United States. The US still hosts 15.9 gigawatts – 67% – of all data centre capacity currently under construction globally. The logic was simple: training frontier models requires thousands of GPUs communicating at extraordinary bandwidth over short distances. You need the chips in one place, the power in one place, and the cooling in one place. The US had the infrastructure head start, and it ran with it.
That logic is now obsolete.
The fundamental nature of AI compute has changed. Deloitte estimates that inference – the actual running of AI models in production – now accounts for two-thirds of all AI compute, up from roughly half in 2025 and a fraction of the total just two years before that. Enterprise surveys show inference consuming 85% of organisational AI budgets. The inference infrastructure market alone more than doubled in a single year, from $9.2 billion to $20.6 billion. Brookfield projects inference will reach 75% of all AI compute needs by 2030.
This is not a gradual evolution. It is a structural break. And it changes the entire geographic calculus of where data centre capital should flow.
Here is why. Training is a centralisation problem. You need massive GPU clusters in a single location with ultra-high-bandwidth interconnects – the kind of setup that favours existing hyperscale hubs where the infrastructure was built first. Inference is a distribution problem. When a user in Frankfurt asks an AI assistant a question, or an agentic system in Jakarta executes a multi-step workflow, that query needs to be processed close to the user. Latency matters. NVIDIA’s new AI Grid architecture, unveiled at GTC 2026, is designed explicitly for geographically distributed inference – sub-500 millisecond response times across distributed edge networks. Akamai is deploying NVIDIA Blackwell GPUs across more than 4,400 edge locations globally. The architecture of inference itself demands geographic distribution.
The rise of agentic AI makes this permanent. We are no longer talking about chatbots responding to single prompts. Agentic systems execute multi-step workflows autonomously – booking travel, managing supply chains, processing compliance checks, orchestrating financial transactions – each step generating multiple inference calls in sequence. ChatGPT alone now serves 900 million weekly active users. Anthropic holds 40% of the enterprise LLM API market. OpenAI’s own revenue projections now break out “Agents” as a distinct and rapidly growing category alongside API and ChatGPT. Every agentic interaction is an inference workload, and every inference workload is latency-sensitive. This is not a temporary spike. It is a permanent baseload of distributed compute demand that will only grow as AI embeds into enterprise operations globally.
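The latency arithmetic behind that claim is easy to sketch. A minimal illustration, using round hypothetical figures (fibre propagation of roughly 200 km per millisecond, a ten-step workflow, illustrative distances) rather than measured latencies:

```python
# Illustrative latency-budget arithmetic for sequential agentic inference.
# All figures are hypothetical round numbers, not measurements.

FIBER_KM_PER_MS = 200  # light in optical fibre covers roughly 200 km per ms


def round_trip_ms(distance_km: float) -> float:
    """Approximate network round-trip time over fibre, one hop each way."""
    return 2 * distance_km / FIBER_KM_PER_MS


def workflow_network_ms(steps: int, rtt_ms: float) -> float:
    """Cumulative network latency for a workflow of sequential inference calls."""
    return steps * rtt_ms


# Hypothetical fibre-path distances:
transatlantic_rtt = round_trip_ms(8_000)  # e.g. a Berlin -> Virginia path: 80 ms
regional_rtt = round_trip_ms(500)         # e.g. a Berlin -> Frankfurt edge: 5 ms

steps = 10  # a modest multi-step agentic workflow

print(workflow_network_ms(steps, transatlantic_rtt))  # 800.0 ms of network time alone
print(workflow_network_ms(steps, regional_rtt))       # 50.0 ms, leaving budget for compute
```

On these assumptions, a ten-step workflow served from across the Atlantic spends 800 ms on the network before any model runs, while the same workflow served regionally stays well inside a 500 ms budget. Chained calls multiply whatever per-hop latency exists, which is why distance compounds for agentic workloads in a way it never did for single-prompt chat.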
NVIDIA’s $20 billion acquisition of Groq’s inference technology underscores the point. The entire GTC 2026 keynote was oriented around inference-specific hardware: the LPX rack system, Attention FFN Disaggregation, the CMX context memory platform, the Vera ETL256 CPU rack. These are not training systems. They are purpose-built for a world where inference – not training – is the dominant workload. Jensen Huang pointed to a potential $1 trillion opportunity in AI systems over the next few years. The infrastructure to capture that opportunity will not look like the infrastructure that trained the models.
Meanwhile, the US market that built the training era is hitting physical limits. Capacity prices in PJM Interconnection – the wholesale electricity market serving 65 million people across the Mid-Atlantic and Midwest – surged from $28.92 per megawatt-day to $329.17, more than an elevenfold rise in two years. The December 2025 auction failed to meet its reliability target for the first time in PJM’s history, with data centres responsible for 63% of the price increase. At least six US states have introduced data centre construction moratoriums, and seven have moved to repeal or restrict tax incentives worth billions annually. Starting June 2026, ratepayers across the PJM region will collectively pay an additional $1.4 billion in capacity costs driven largely by data centre demand. The response has been telling: BNEF now tracks 114 gigawatts of on-site gas generation capacity across 115 projects at US data centres – equivalent to 52% of the entire US data centre pipeline. When operators are building their own power plants to bypass the grid, the grid constraint is not theoretical. It is structural.
The US is not going to stop building data centres. But every marginal megawatt of new capacity is getting dramatically more expensive, politically more contested, and physically harder to deliver. For institutional capital seeking to deploy into AI infrastructure, the risk-adjusted returns in the US are compressing precisely as opportunities elsewhere are opening up.
Europe is the most immediate beneficiary. The European Data Centre Association projects €176 billion ($208 billion) in cumulative investment between 2026 and 2031, with total capacity – operating and planned combined – at 24.4 gigawatts and pipeline growing at 43% annually. Yet today, EMEA has just 2.9 gigawatts under construction compared to 17 gigawatts in the Americas. That gap is the opportunity. What makes Europe compelling for inference specifically is a combination of regulatory pull and structural readiness. GDPR’s data sovereignty requirements mean European user data increasingly needs to be processed on European soil. Real-time and agentic AI applications need sub-500ms latency – you cannot serve a user in Berlin or Milan from a data centre in Virginia and meet that threshold. The EU’s EuroHPC “AI Factories” programme and national AI strategies in France, the UK, and the Nordics are creating explicit policy tailwinds for sovereign AI infrastructure.
The capital is already moving. Nscale committed to deploying over 100,000 NVIDIA Vera Rubin GPUs to Europe by 2027 and secured a $1.4 billion delayed draw loan to fund GPU deployment across Norway, Portugal, Iceland, and the UK. Its March 2026 Series C valued the company at $14.6 billion – the highest of any privately held neocloud tracked by BNEF – and it is increasingly positioning itself as a European sovereign AI champion. Nebius raised $4 billion in convertible notes to accelerate European AI infrastructure deployment, anchored by a new deal with Meta. PoweringAI launched as a dedicated vehicle to convert European industrial legacy sites into AI infrastructure. In a single week in March 2026, new data centre projects were announced in Poland (gigawatt-scale), Denmark, Germany, Spain, Estonia, Norway, and Finland – several of them on repurposed industrial sites, including a former paper mill.
Europe’s traditional hubs are not without their own constraints. Amsterdam has limited new developments until at least 2035. Frankfurt’s AI developments are paused pending new grid capacity. The UK grid queue has surged 460%, and Edinburgh approved a temporary ban on AI data centres. But this is exactly the dynamic that makes the inference thesis interesting: the constraint in legacy hubs is forcing capital into secondary markets – the Nordics, the Baltics, Iberia, Central and Eastern Europe – where power is available, government incentives are strong, and the greenfield opportunity is real. We are seeing this ourselves, with active opportunities in European markets that would not have existed eighteen months ago.
Southeast Asia is the second front. The region’s data centre market is projected to more than double to $30.5 billion by 2030, with capacity expansion of 4,620 MW – an increase of 180% – driven by pipeline growth of 350% in Malaysia, 250% in Indonesia, and 200% in Thailand. Hyperscalers have committed over $20 billion in announced capex across the region. BNEF data confirms Asia-Pacific has 3.2 gigawatts under construction – already exceeding EMEA – with Malaysia ranking third globally in capacity additions.
The region’s 700 million people form an increasingly digitised consumer base that generates inference demand locally. Singapore provides the financial and connectivity backbone while facing land and power constraints of its own, pushing capacity into neighbouring Malaysia’s Johor corridor and Thailand’s industrial zones. Digital Edge secured Indonesia’s largest data centre green loan at $665 million. GMI Cloud unveiled a $12 billion sovereign AI initiative in Japan. Thailand approved $2.7 billion in data centre investment applications in a single month. The buildout is happening at a pace that mirrors the US trajectory of three years ago – but oriented around inference from the start.
We are seeing this first-hand. Deals are crossing our desk in Southeast Asian industrial zones and European markets that would have been unthinkable two years ago – infrastructure-grade opportunities with clear power access, available land, and government incentives specifically designed to attract AI compute capacity. The structure we apply to these opportunities is the same three-layer model we have written about before: separate the infrastructure (PropCo), the compute (ComputeCo), and the energy (PowerCo), and finance each layer with the capital best suited to its risk profile. What has changed is not the structuring framework. What has changed is the geography in which it is being applied.
There are real risks that sophisticated allocators will price in. US export controls on advanced AI chips constrain what can be deployed in certain jurisdictions. While inference hardware is generally less affected than training clusters – purpose-built inference chips increasingly use non-TSMC supply chains – the regulatory trajectory is tightening, not loosening. The latest round of controls specifically targets advanced chips destined for certain markets, and any operator deploying outside the US needs to build export compliance into their procurement strategy from day one. This is a structuring problem, not a deal-breaker, but it requires jurisdiction-specific solutions that many allocators are not yet equipped to navigate. Grid readiness varies dramatically across secondary European markets. Some Southeast Asian markets lack the operational talent depth to run mission-critical AI infrastructure at scale.
The window for early movers is measurable. Neocloud leasing contracts are being signed now for delivery starting in 2026 and 2027 – BNEF tracked over $50 billion in disclosed hyperscaler-neocloud offtake agreements in just three months last year, with contract values still accelerating. The AI labs driving demand are not slowing down: they raised $160 billion in the first two months of 2026 alone, led by OpenAI’s $110 billion round at an $840 billion valuation and Anthropic’s $30 billion Series G. OpenAI is targeting $600 billion in infrastructure spending through 2030. Gross margins on inference services are running at 30–40%, validating the underlying economics. The capital is being committed now. The infrastructure positions are being locked in now. Allocators who wait for the geographic thesis to become consensus will find that the best sites, power access, and government incentive packages have already been taken. The next twelve to eighteen months will determine who captures the first wave of inference-era value outside the US.
The training era concentrated AI infrastructure in a handful of US locations because the physics of training demanded it. The inference era will distribute AI infrastructure globally because the economics and the architecture demand it. Dell’Oro projects data centre capital expenditure reaching $1.7 trillion by 2030. The question for allocators is not whether that capital will be deployed. It is where – and how it is structured when it gets there.
The US built the training layer. Europe and Southeast Asia will build the inference layer. The investors who recognise this shift early will structure accordingly. The ones who don’t will be buying US capacity at the top of the market while the next wave of value creation happens elsewhere.
Bijan Alizadeh is Founder and Chairman of Storm Group and Cypher Capital.