Compute Cost Curve
Six-step GPU TCO decomposition across acquisition, power, cooling, network, depreciation, and idle.
Problem solved
Hyperscaler procurement teams, neocloud capex committees, and PE infrastructure investors need an apples-to-apples cost per effective FLOP across chip generations and deployment archetypes. Spec sheets do not produce this number. The framework produces it under named utilization, useful-life, and energy assumptions, with the sensitivity ranking that matters for procurement.
Inputs
- Acquisition price per accelerator (H200, B200, GB200, MI300X, TPU v5p, custom), with a 12-month forward curve where observable
- MLPerf training and inference results, normalized to dense FP8 throughput
- Power draw at sustained load and idle, including NVLink and host CPU overhead
- Cooling architecture (air, direct-to-chip, immersion) with PUE assumption by site type
- Industrial electricity tariff at the site (EIA 861, ISO/RTO LMP, named PPA)
- Network fabric (NVLink, NDR InfiniBand, RoCE) and the implied scaling efficiency from MLPerf strong-scaling tables
- Useful-life and depreciation schedule (24-, 36-, or 48-month base cases)
- Utilization profile by workload (training-heavy, inference-heavy, mixed) with idle fraction
Outputs
- All-in dollars per effective FP8 PFLOP-hour at named utilization
- Channel decomposition: capex per hour, power per hour, cooling per hour, network amortization per hour, idle drag
- Tornado chart of TCO sensitivity to electricity price, utilization, useful life, and PUE
- Dollars per million tokens for inference workloads at named batch sizes and quantization
- Crossover utilization: the point at which a newer chip undercuts an older chip on TCO terms
- Scenario tree for forward chip prices and electricity prices
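As a minimal sketch of the tornado output, the Python below runs a one-at-a-time sweep: each input is moved to its low and high value while the others stay at base, and the inputs are ranked by the resulting swing in cost. The function names, the simplified `toy_cost` model, and every number are placeholders chosen for illustration; they are not framework defaults or market data.

```python
def tornado(model, base, ranges):
    """One-at-a-time sensitivity: swing each input across its range
    while holding the others at base, then rank by output swing."""
    rows = []
    for name, (low, high) in ranges.items():
        out_low = model(**{**base, name: low})
        out_high = model(**{**base, name: high})
        rows.append((name, out_low, out_high, abs(out_high - out_low)))
    # Largest swing first, which is the order a tornado chart is drawn in.
    return sorted(rows, key=lambda row: row[3], reverse=True)


def toy_cost(tariff_usd_per_kwh, utilization, life_months, pue):
    """Hypothetical stand-in for the full cost stack (USD per PFLOP-hour)."""
    capex_usd, load_kw, pflops = 40_000.0, 1.0, 2.0  # placeholder values
    capex_per_hour = capex_usd / (life_months * 730.0)  # ~730 hours per month
    energy_per_hour = load_kw * utilization * pue * tariff_usd_per_kwh
    return (capex_per_hour + energy_per_hour) / (pflops * utilization)


# Base case and low/high ranges are illustrative assumptions.
base = {"tariff_usd_per_kwh": 0.12, "utilization": 0.70,
        "life_months": 36, "pue": 1.20}
ranges = {
    "tariff_usd_per_kwh": (0.06, 0.18),
    "utilization": (0.50, 0.90),
    "life_months": (24, 48),
    "pue": (1.05, 1.40),
}

ranking = tornado(toy_cost, base, ranges)
for name, low_out, high_out, swing in ranking:
    print(f"{name:<20} low={low_out:.3f} high={high_out:.3f} swing={swing:.3f}")
```

A one-at-a-time sweep ignores interactions between inputs (for example, tariff and PUE multiply each other), so it is a ranking aid rather than a full uncertainty analysis.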
Method
- Step 1. Normalize throughput. Take MLPerf dense FP8 training and inference results, scale to the workload mix in the engagement scope, and convert to effective PFLOP-hours. Interconnect effects (NVLink, InfiniBand) and the weak-scaling versus strong-scaling deltas are kept as separate adjustments rather than folded into the throughput figure.
- Step 2. Build the capex line. Sum the acquisition price per accelerator, the in-rack fabric, host servers, and rack-level integration; cross-rack fabric is handled in Step 5. Spread the total, net of salvage value, across the useful-life base case (default 36 months for training, 48 months for inference). Salvage value defaults to 5 percent.
- Step 3. Build the power line. Sustained-load draw times utilization, plus idle draw times idle fraction, gives average power; multiply by the site electricity tariff to get cost per hour. Behind-the-meter PPA prices override grid tariffs where contractually applicable.
- Step 4. Build the cooling line. Apply PUE by cooling architecture: 1.40 for legacy air, 1.20 for direct-to-chip, 1.05 for immersion. Multiply IT load by (PUE minus one) to get the cooling overhead in power terms, then price that overhead at the site tariff. Water-side cooling adds a separate water-cost line for jurisdictions where water is constrained.
- Step 5. Build the network line. NVLink within the rack is amortized into capex; cross-rack NDR InfiniBand or RoCE has its own capex amortization plus power. Scaling-efficiency loss above 1,024 GPUs is applied as a tax that lowers effective FLOPs.
- Step 6. Sum, divide, sensitivity. Add all lines, divide by effective PFLOP-hours at named utilization, and run the tornado on electricity price, utilization, useful life, and PUE. Report the crossover utilization where the candidate undercuts the incumbent.
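The six steps can be sketched in Python as follows. This is an illustrative reading of the method, not the framework's implementation: the `Accelerator` field names, the 730-hours-per-month convention, applying the salvage fraction to the cross-rack fabric, and treating idle drag as the energy cost of idle draw are all assumptions. Step 1 is taken as already done, so `peak_pflops` is assumed to be normalized dense FP8 throughput.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730.0  # assumption: 8,760 hours per year / 12


@dataclass
class Accelerator:
    """Per-accelerator inputs; every field name is hypothetical."""
    capex_usd: float           # accelerator + host + rack + in-rack fabric share
    network_capex_usd: float   # cross-rack fabric share
    network_kw: float          # cross-rack fabric power share
    load_kw: float             # sustained draw under load
    idle_kw: float             # draw when idle
    peak_pflops: float         # normalized dense FP8 PFLOP/s (Step 1)
    scaling_efficiency: float  # 0..1, the Step 5 scaling tax


def cost_per_effective_pflop_hour(chip, utilization, tariff_usd_per_kwh, pue,
                                  life_months=36, salvage_fraction=0.05):
    """Return (total, breakdown) in USD per effective PFLOP-hour."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    life_hours = life_months * HOURS_PER_MONTH
    depreciable = 1.0 - salvage_fraction

    # Step 2: straight-line capex per hour, net of salvage.
    capex_hr = chip.capex_usd * depreciable / life_hours

    # Step 3: power split into busy draw and idle draw.
    busy_kw = chip.load_kw * utilization
    idle_kw = chip.idle_kw * (1.0 - utilization)
    power_hr = busy_kw * tariff_usd_per_kwh
    idle_hr = idle_kw * tariff_usd_per_kwh

    # Step 4: cooling overhead = IT load * (PUE - 1), priced at the tariff.
    it_kw = busy_kw + idle_kw + chip.network_kw
    cooling_hr = it_kw * (pue - 1.0) * tariff_usd_per_kwh

    # Step 5: cross-rack fabric amortization plus its own power.
    network_hr = (chip.network_capex_usd * depreciable / life_hours
                  + chip.network_kw * tariff_usd_per_kwh)

    # Step 6: sum the lines and divide by effective PFLOP delivered per hour.
    effective_pflops = chip.peak_pflops * chip.scaling_efficiency * utilization
    lines = {"capex": capex_hr, "power": power_hr, "cooling": cooling_hr,
             "network": network_hr, "idle": idle_hr}
    breakdown = {name: value / effective_pflops for name, value in lines.items()}
    return sum(breakdown.values()), breakdown


# Placeholder inputs for illustration only; not vendor or market data.
example = Accelerator(capex_usd=40_000.0, network_capex_usd=4_000.0,
                      network_kw=0.1, load_kw=1.0, idle_kw=0.2,
                      peak_pflops=2.0, scaling_efficiency=0.9)
for u in (0.65, 0.80):
    total, parts = cost_per_effective_pflop_hour(example, u, 0.12, 1.20)
    print(f"utilization={u:.2f} total={total:.3f} USD per effective PFLOP-hour")
```

Returning the breakdown alongside the total keeps the channel decomposition available for reporting, and the same function can serve as the model inside a sensitivity sweep.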
Assumptions
- MLPerf dense FP8 throughput is the comparison currency. Sparse and reduced-precision throughput is reported separately as a sensitivity case.
- Useful life defaults to 36 months for training and 48 months for inference. Hyperscalers are increasingly running 60-month inference deployments; the framework reports both.
- Idle draw is a real cost, not a rounding error. The framework requires an explicit idle-fraction input.
- Electricity price is annualized and held constant within the depreciation window for the base case. A separate scenario tree handles forward-curve sensitivity.
- NVLink and InfiniBand scaling losses follow the published MLPerf strong-scaling tables. Custom fabrics require an engagement-specific calibration.
Limitations
- Real-world utilization at hyperscaler scale is rarely the spec-sheet utilization. The framework requires the operator's own utilization data; in the absence of it, results carry an explicit utilization-band warning.
- Custom accelerator pricing is often opaque. The framework runs three pricing scenarios (low, mid, high) where contracts are confidential.
- Software stack maturity (CUDA, ROCm, custom) does not appear directly in the cost stack but is captured implicitly through MLPerf throughput.
- Inference token economics are workload-specific. Generic dollars-per-million-tokens figures are not portable across model class, context length, or batching strategy.
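To make the portability point concrete, the sketch below converts an hourly accelerator cost into dollars per million tokens. The function name and all throughput figures are hypothetical placeholders; real tokens-per-second values depend on model class, context length, quantization, and batching, which is exactly why the resulting number does not transfer between workloads.

```python
def usd_per_million_tokens(cost_usd_per_accel_hour, tokens_per_second):
    """Convert an all-in hourly cost and a measured throughput
    into USD per million generated tokens."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600.0
    return cost_usd_per_accel_hour / tokens_per_hour * 1_000_000.0


# Hypothetical throughput by batch size for one model on one accelerator.
throughput_by_batch = {1: 60.0, 8: 400.0, 64: 2_000.0}
hourly_cost = 3.0  # placeholder all-in USD per accelerator-hour

for batch, tps in throughput_by_batch.items():
    cost = usd_per_million_tokens(hourly_cost, tps)
    print(f"batch={batch:<3} tokens/s={tps:<7.0f} USD per million tokens={cost:.2f}")
```

The same hourly cost yields very different per-token costs as throughput changes, so a quoted figure is only meaningful together with its batch size and quantization.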
Example application
Applied to a 2026 hyperscaler procurement decision: H200 versus B200 versus GB200 NVL72 for a 50,000-accelerator training cluster on PJM with a blended power tariff of 120 dollars per MWh and 36-month depreciation. The framework runs the six steps, produces the dollars per effective FP8 PFLOP-hour for each candidate at 65 percent and 80 percent utilization, and identifies the crossover utilization at which GB200 economics dominate H200 even at the higher acquisition price. See Hyperscaler GPU Procurement 2026.
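One way to read the crossover calculation is sketched below, under two assumptions that the source does not state: both chips run at the same utilization, and each chip's hourly cost is linear in utilization (a fixed part plus a part that scales with busy time). The `CostLine` type, its field names, and the numbers are hypothetical and are not figures for H200, B200, or GB200.

```python
from typing import NamedTuple, Optional


class CostLine(NamedTuple):
    """Hourly cost of one accelerator, linear in utilization u:
    cost(u) = fixed + marginal * u, delivering pflops * u per hour."""
    fixed: float     # USD/hour at zero utilization (amortization, idle draw)
    marginal: float  # extra USD/hour going from idle to fully busy
    pflops: float    # effective PFLOP/s when busy


def unit_cost(chip: CostLine, u: float) -> float:
    """USD per effective PFLOP-hour at utilization u."""
    return (chip.fixed + chip.marginal * u) / (chip.pflops * u)


def crossover_utilization(old: CostLine, new: CostLine) -> Optional[float]:
    """Utilization at which both chips cost the same per effective
    PFLOP-hour, or None if the curves do not cross inside (0, 1]."""
    # unit_cost(u) = (fixed / pflops) / u + marginal / pflops, so the gap
    # between the two chips is d_fixed / u + d_marginal.
    d_fixed = new.fixed / new.pflops - old.fixed / old.pflops
    d_marginal = new.marginal / new.pflops - old.marginal / old.pflops
    if d_marginal == 0:
        return None
    u = -d_fixed / d_marginal
    return u if 0.0 < u <= 1.0 else None


# Placeholder chips: the newer one has higher fixed cost per PFLOP
# but lower marginal cost per PFLOP, so the curves cross.
incumbent = CostLine(fixed=1.0, marginal=1.0, pflops=1.0)
candidate = CostLine(fixed=3.0, marginal=0.5, pflops=2.0)

u_star = crossover_utilization(incumbent, candidate)
print("crossover utilization:", u_star)
```

When the function returns `None`, one chip is cheaper at every utilization in range under these assumptions, which is itself a reportable result. A different reading, where the incumbent is held at its observed utilization and only the candidate's utilization varies, would need a different equation.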
Where the method has been applied.
Hyperscaler GPU Procurement 2026: H200 vs B200 vs GB200 in Honest Deployment Math
Blackwell is no longer a roadmap promise, it is a procurement reality, and the only honest comparison runs on workload-weighted utilization rather than peak FLO...
2026-04-26
AI inference cost decline 2026: the trajectory and what it forces buyers to plan for
Token prices have fallen roughly 10x per year for equivalent capability since 2023, and the buyers who treat inference as a fixed line item are mispricing every...
2026-04-25
AI capex met the grid: when the megawatt curve breaks
Hyperscaler capital spending crossed 500 billion dollars across 2025 and 2026 while the average US interconnection wait sits above 4 years. The constraint is no...
2026-04-26
AI Talent Compensation 2026: Where Comp Is Going Across Labs, Hyperscalers, and Finance
Frontier labs, hyperscaler ML orgs, and quant funds are converging on a narrow pool of researchers, with equity scaling on private valuations and skill premiums...
2026-04-25
Quebec hydropower and the new gating of AI compute
Quebec spent two decades selling itself as the cheapest, greenest place on the continent to plug in a data center. In 2026 Hydro-Quebec is throttling new connec...
2026-04-26
Compute behind a fence: US AI export controls in 2026
Four years of BIS rules have built a tiered global compute regime. The October 2022 baseline, the October 2023 patch, the December 2024 HBM and tooling rules, a...
2026-04-26
Small modular reactors meet the hyperscaler load curve
Eighteen months after the Google Kairos and Amazon X-energy announcements, the SMR thesis has moved from PowerPoint to procurement. The binding constraints are ...
2026-04-26
AI inference economics in 2026: GPT, Claude, Gemini, and the pricing war that is rewriting the application stack
Token prices are falling roughly 10x per year at constant capability, the marginal frontier provider is now a Chinese open weight lab, and hyperscaler capex is ...
2026-04-26
Korea memory in 2026: Samsung versus SK Hynix, NVIDIA qualification, and the HBM share war
SK Hynix turned a two year qualification lead at NVIDIA into roughly half of the global HBM market and most of its profit pool, while Samsung is rebuilding its ...
2026-04-26
Frontier AI training cost trajectory 2026: the run rate, the deal stack, and the power-bound horizon
Frontier pretraining budgets crossed the half billion mark in 2025 and are heading toward one to three billion dollars per model by 2027, with cluster power, no...
2026-04-26
The Custom Silicon Insurgency Against Nvidia in 2026
AWS Trainium 2, Google TPU v5p and Trillium, Microsoft Maia, Meta MTIA, and a possible OpenAI ASIC are reshaping where AI compute margin lives, but the binding ...
2026-04-26
ASEAN Sovereign AI in 2026: Models, Compute, and the Regulatory Patchwork
Singapore is buying TPU access while building SEA-LION, Indonesia is shipping Sahabat-AI in five languages, Thailand is scaling Typhoon, Malaysia is funding chi...
Related methods.
FEOC Stack
Foreign-entity-of-concern decomposition for IRA Section 30D and 45X bills of materials.
Pass-Through Decomp: Tariff Pass-Through Decomposition
Five-step decomposition of who pays a tariff: importer margin, exporter price, or domestic consumer.
Multiplier Bench: Fiscal Multiplier Bench
Country-level fiscal multiplier estimates by spending category, business cycle phase, and monetary regime.
AI Siting Score: Energy-AI Siting Score
Composite jurisdiction ranking for AI compute siting across firm power, water, latency, talent, and policy stability.
IPMI: Industrial Policy Maturity Index
Five-dimensional score of whether a country's industrial policy is investible, across design, disbursement, conditionality, monitoring, and political durability.
Substitution Map: Tariff-Substitution Elasticity Map
Estimating which third-country supplier captures share when a tariff hits the primary origin, by HS6, with confidence bands.
Restructuring Stack: Sovereign Restructuring Stack
Ordering of sovereign debt instruments by haircut tolerance, legal seniority, and political optics.
CMCI: Critical Minerals Concentration Index
HHI-based concentration scoring on extraction, processing, and refining for any critical mineral chain.
ACAS: AI Capex Absorption Score
Quantifies whether a region or grid can absorb a hyperscaler buildout in dollars, megawatts, megaliters, and engineers.
SDPS: Sovereign Default Probability Stack
Six-factor early-warning framework for emerging-market sovereign credit, calibrated to the post-Common Framework dataset.
EOTM: Election Outcome Translation Matrix
From electoral results to policy probability, with named instrument and calendar bindings.
CBPS: Cross-Border Payments Stack
SWIFT, CIPS, INSTEX, BRICS Pay, mBridge, and stablecoin rails scored on throughput, sanction durability, and counterparty network.
CABS: Climate Adaptation Bond Score
Adaptation-finance instrument grading on additionality, measurability, and sovereign-credit-quality interaction.