Insights

Where the math is defensible.

Long-form research on live enterprise decisions. Publication is selective. Every number traces to a named source. No takes without evidence.

AI and compute economics 2026-04-26 11 minute read 15 sources

Compute behind a fence: US AI export controls in 2026

Four years of BIS rules have built a tiered global compute regime. The October 2022 baseline, the October 2023 patch, the December 2024 HBM and tooling rules, and the January 2025 AI Diffusion Framework now shape every chip flow worth tracking. DeepSeek R1 forced the harder question: are the controls working?

United States chip export controls have moved from a narrow performance ceiling into a full architecture for who gets to compute at frontier scale. The October 7, 2022 BIS rule set the first thresholds. October 17, 2023 closed the H800 and A800 workaround. December 2, 2024 added high bandwidth memory and 24 wafer fabrication tools to the ...

AI and compute economics 2026-04-26 11 minute read 21 sources

AI inference economics in 2026: GPT, Claude, Gemini, and the pricing war that is rewriting the application stack

Token prices are falling roughly 10x per year at constant capability, the marginal frontier provider is now a Chinese open-weight lab, and hyperscaler capex is committing 325 billion dollars to a market where the unit price keeps collapsing.

Inference is the largest unsolved cost line in enterprise AI. Output token prices for frontier general capability have fallen from 60 dollars per million tokens in 2023 to between one and three dollars in 2026, a roughly 20- to 60-fold compression. Epoch AI estimates a 10x annual decline at constant capability, sustained for three years, driven b...

ai inference gpt claude gemini tokens nvidia mlperf

AI compute and energy 2026-04-26 12 minute read 12 sources

The Custom Silicon Insurgency Against Nvidia in 2026

AWS Trainium 2, Google TPU v5p and Trillium, Microsoft Maia, Meta MTIA, and a possible OpenAI ASIC are reshaping where AI compute margin lives, but the binding constraint sits one layer down at HBM and CoWoS-L.

The hyperscalers spent 2024 and 2025 telling investors that custom silicon would relieve their dependence on Nvidia, and 2026 is the year those claims start meeting empirical scrutiny. AWS has stood up Project Rainier for Anthropic at a publicly disclosed scale of more than one million Trainium 2 chips. Google has placed its TPU v5p gener...

custom silicon nvidia ai infrastructure gpus tpu trainium data centers tsmc hbm cuda

AI and compute economics 2026-04-26 10 minute read 12 sources

Korea memory in 2026: Samsung versus SK Hynix, NVIDIA qualification, and the HBM share war

SK Hynix turned a two-year qualification lead at NVIDIA into roughly half of the global HBM market and most of its profit pool, while Samsung is rebuilding its memory business around a delayed HBM3E ramp, an HBM4 catch-up plan, and a foundry separation that signals how seriously Suwon now treats the gap.

High bandwidth memory has become the single most concentrated profit pool in the AI compute stack outside NVIDIA itself. TrendForce sized the HBM market at roughly USD 16 billion in 2024 and projects USD 33 billion in 2025, with SK Hynix holding about 53 percent share, Samsung 38 percent, and Micron 9 percent in the fourth quarter of 2024...

korea samsung sk-hynix hbm nvidia memory gpu yongin

AI and compute economics 2026-04-26 11 minute read 17 sources

UAE G42, sovereign AI ambitions, and the US China tech triangulation through 2026

G42 spent 2024 and 2025 converting Abu Dhabi political capital into US compute access, a divorce from Chinese hardware, and a balance sheet position inside the largest AI deals on the planet. Microsoft put 1.5 billion dollars on the cap table in April 2024, the Bureau of Industry and Security wrote an AI Diffusion Framework in January 2025 that named the UAE explicitly, and Stargate UAE pushed five gigawatts of campus capacity to Abu Dhabi. The story is no longer whether the Gulf builds sovereign AI. The story is whose chips, whose models, and whose security guarantees the buildout runs on.

G42 was incorporated in Abu Dhabi in 2018 as a Mubadala-adjacent holding company chaired by Sheikh Tahnoon bin Zayed Al Nahyan, the UAE National Security Advisor and brother of the President. Through 2023 the group accumulated stakes in TikTok parent ByteDance, Huawei surveillance, BGI Genomics sequencing partnerships, and a Wuxi cloud jo...

uae g42 ai microsoft nvidia saudi-arabia compute sovereignty

AI and compute economics 2026-04-26 13 minute read 12 sources

Trump's AI Action Plan: Compute Sovereignty, Export Control Reset, and the Frontier Regulatory Vacuum

Executive Order 14179, the rescinded AI Diffusion Framework, the Stargate USD 500B compute commitment, and a renamed AI Safety Institute together redraw US AI governance around speed, capital, and selective export pressure rather than precaution.

On January 23, 2025, President Trump signed Executive Order 14179, Removing Barriers to American Leadership in Artificial Intelligence, rescinding Biden's EO 14110 and ordering an AI Action Plan within 180 days. By July 2025 the plan launched with a 60-day Request for Information that drew submissions from OpenAI, Anthropic, Google, Meta,...

ai policy trump ai action plan stargate nvidia ai diffusion nist

AI and compute economics 2026-04-23 12 minute read 5 sources

The 3.2x compute curve: what GB200 actually changes about training ROI

NVIDIA says 4x. The number a CFO needs is closer to 3.2x, and it holds only above 55 percent utilization.

NVIDIA positioned GB200 NVL72 as a 4x training improvement over H100. On a workload-weighted, all-in-cost basis the number is closer to 3.2x, and only if sustained utilization clears a 55 to 60 percent threshold. This brief unpacks the curve, where the gains come from in hardware terms, and the three places the ROI tends to break in real ...

ai compute capex training infrastructure nvidia blackwell