Agentbrisk

AI Energy Consumption in 2026: The Real Numbers on Data Centers and Inference

April 10, 2026 · Editorial Team · 7 min read · ai-energy, sustainability, data-centers

The question of how much energy AI actually uses has been buried under years of vague estimates, corporate press releases, and overclaiming in both directions. Some people insist AI will destabilize the power grid within two years. Others dismiss the concern as exaggerated. The actual numbers sit somewhere more specific and more nuanced than either framing.

This post collects the real figures that are available, explains what they mean, and separates what we know from what we're still guessing at.


Training vs. inference: two very different problems

Most discussions of AI energy use conflate two separate things: training large models and running inference on those models once they're trained.

Training is a one-time (or occasional) cost. Training GPT-4 was estimated at roughly 50,000 megawatt-hours (MWh), based on compute estimates and typical GPU power draw. Training Llama 3 70B required approximately 6,000 MWh. These are large numbers, but they are one-time expenditures. For scale, a coal plant running at an average output of 200-250 MW generates about 5,000-6,000 MWh per day, so training GPT-4 was roughly equivalent to running that plant for 8-10 days.
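The plant-days comparison is simple division, and a short sketch makes it easy to rerun with other estimates. The constants below are the figures quoted in the text; the function names are illustrative only.

```python
# Rough estimates quoted in the text; none of these are measured values.
GPT4_TRAINING_MWH = 50_000
LLAMA3_70B_TRAINING_MWH = 6_000
PLANT_MWH_PER_DAY = (5_000, 6_000)  # (low, high) daily plant output


def plant_days(training_mwh, plant_mwh_per_day=PLANT_MWH_PER_DAY):
    """Return (min_days, max_days) of plant output equal to training_mwh."""
    low_output, high_output = plant_mwh_per_day
    # The higher the daily output, the fewer days are needed.
    return training_mwh / high_output, training_mwh / low_output


def average_power_mw(mwh_per_day):
    """Average power in MW implied by a daily energy output in MWh."""
    return mwh_per_day / 24


if __name__ == "__main__":
    print("GPT-4 plant-days:", plant_days(GPT4_TRAINING_MWH))
    print("Llama 3 70B plant-days:", plant_days(LLAMA3_70B_TRAINING_MWH))
    print("Implied plant output (MW):",
          [average_power_mw(x) for x in PLANT_MWH_PER_DAY])
```

The same function shows that the Llama 3 70B estimate corresponds to about one day of output from the same plant.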

Inference is the continuous, ongoing cost. Every query processed, every document analyzed, every image generated, every API call made draws power. And because frontier model usage has grown dramatically, inference now accounts for a larger share of AI's total energy footprint than training.

Goldman Sachs estimated in 2024 that by 2025-2026, inference would represent 70-80% of total AI power consumption. That shift matters because inference is not a one-time cost that you amortize over time. It scales with usage, and usage keeps growing.


Per-query energy costs

The per-query energy numbers depend heavily on the model size and deployment infrastructure, but there are credible published estimates.

A single ChatGPT query (GPT-4 class): Approximately 0.001 to 0.003 kWh (1-3 watt-hours). The IEA's 2024 report on AI energy use used roughly 2.9 Wh per ChatGPT query as an estimate. For comparison, a Google search uses about 0.0003 kWh (0.3 Wh), roughly 10 times less.

A single image generation request: Higher than text. Generating an image with Stable Diffusion or Midjourney uses roughly 0.002 to 0.01 kWh depending on the model size and number of inference steps. Midjourney's infrastructure runs heavy parallel GPU batches that are difficult to estimate precisely.

A Claude API call with a 10K token context: Rough estimate around 0.001-0.005 kWh, heavily dependent on output length. Longer outputs cost more energy.

At ChatGPT's reported usage of roughly 100 million daily active users making an average of 2-3 queries each, that's 200-300 million queries per day. At 2.9 Wh each, that's roughly 580-870 MWh per day for ChatGPT inference alone. Over a full year that comes to roughly 210-320 GWh, comparable to the electricity consumption of a small European city (on the order of 50,000-100,000 people).
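The fleet-level arithmetic is worth checking directly, since a slip in unit conversion changes the answer by orders of magnitude. This is a minimal sketch using the usage and per-query figures quoted above; the function names are illustrative, and the inputs are estimates rather than measurements.

```python
# Inputs quoted in the text; all are estimates.
WH_PER_QUERY = 2.9
DAILY_USERS = 100_000_000
QUERIES_PER_USER = (2, 3)  # (low, high) average queries per user per day


def daily_energy_mwh(users, queries_per_user, wh_per_query):
    """Total daily inference energy in MWh."""
    watt_hours = users * queries_per_user * wh_per_query
    return watt_hours / 1_000_000  # Wh -> MWh


def annual_energy_gwh(daily_mwh, days=365):
    """Annualize a daily MWh figure and convert to GWh."""
    return daily_mwh * days / 1_000  # MWh -> GWh


if __name__ == "__main__":
    for q in QUERIES_PER_USER:
        daily = daily_energy_mwh(DAILY_USERS, q, WH_PER_QUERY)
        print(f"{q} queries/user: {daily:.0f} MWh/day, "
              f"{annual_energy_gwh(daily):.0f} GWh/year")
```

Changing `WH_PER_QUERY` to the low end of the per-query range (1 Wh) cuts every result by roughly two thirds, which shows how sensitive the total is to that single estimate.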


Data center demand growth

Data center power demand is where AI's energy impact becomes most tangible. Data centers consumed about 200-250 TWh globally in 2022. The IEA's 2024 projections suggest that data center demand, driven in part by AI, could double by 2026 to 400-500 TWh or more.

In the United States, the scale is more concrete. Northern Virginia, which houses the world's largest concentration of data centers (earning it the unofficial title "data center capital of the world"), has been straining the regional power grid. Dominion Energy, which supplies much of the region, has been warning since 2023 that planned data center expansion would require significant grid upgrades. Their projections show data center load in Virginia growing from about 4,000 MW in 2022 to potentially 13,000-15,000 MW by 2028.
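To put the Virginia projection in growth-rate terms, the compound annual growth rate implied by the two endpoints can be computed as follows. This is a sketch that assumes smooth exponential growth between 2022 and 2028, which real load growth will not follow exactly; the helper names are illustrative.

```python
import math


def cagr(start, end, years):
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1


def doubling_time_years(rate):
    """Years needed to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + rate)


if __name__ == "__main__":
    start_mw, years = 4_000, 2028 - 2022
    for end_mw in (13_000, 15_000):
        rate = cagr(start_mw, end_mw, years)
        print(f"{start_mw} -> {end_mw} MW: {rate:.1%} per year, "
              f"doubling every {doubling_time_years(rate):.1f} years")
```

Under that assumption the projection implies load growth of a little over 20% per year, sustained for six years.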

Microsoft, Google, Amazon, and Meta have collectively announced hundreds of billions in data center investment through 2026-2028. Microsoft alone announced $80 billion in data center spending for fiscal year 2025. Even discounting some of that as capacity that won't all come online on schedule, the scale of new power infrastructure being planned is substantial.


What the companies say about carbon

The major AI companies have made ambitious carbon commitments that look increasingly hard to meet given the infrastructure buildout underway.

Google pledged to be carbon-free by 2030 (not carbon-neutral, but actually powered by carbon-free sources on a 24/7 matched basis). Their 2024 environmental report showed that total greenhouse gas emissions had increased 48% since 2019, driven by data center energy use. The company acknowledged that their 2030 carbon-free goal is "more challenging to achieve" given AI growth.

Microsoft had pledged to be carbon negative by 2030 and remove all historical carbon by 2050. Their 2024 sustainability report showed emissions up 29% from 2020, also attributed to data center expansion for AI.

Amazon (AWS) says it powers its operations with 100% renewable energy, on the basis that it matched 100% of its electricity consumption with renewable energy purchases in 2023. However, "matching" renewable purchases with consumption on an annual basis is much weaker than actual 24/7 carbon-free power. When the grid is running on gas at 11 PM, buying a renewable credit for solar power generated during the day doesn't mean the data center is actually carbon-free at that moment.
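The difference between annual matching and 24/7 matching can be made concrete with a toy example. The hourly profile below is invented for illustration (a flat load and a solar-only supply over a single day), not data from any company's report.

```python
def annual_match_fraction(load, clean):
    """Share of total load covered when clean energy is netted over the whole period."""
    return min(1.0, sum(clean) / sum(load))


def hourly_match_fraction(load, clean):
    """Share of load covered when clean energy only counts in the hour it is produced."""
    matched = sum(min(l, c) for l, c in zip(load, clean))
    return matched / sum(load)


# Hypothetical day: a data center drawing 10 MWh every hour, and a solar
# contract delivering 30 MWh per hour from 08:00 to 16:00 and nothing otherwise.
load = [10] * 24
clean = [30 if 8 <= hour < 16 else 0 for hour in range(24)]

if __name__ == "__main__":
    print("Netted over the period:", annual_match_fraction(load, clean))
    print("Matched hour by hour:  ", hourly_match_fraction(load, clean))
```

In this toy case the purchased solar equals total consumption, so the netted figure is 100%, while only a third of the load is actually covered in the hour it occurs. Real portfolios mix wind, hydro, and storage, so the gap is usually smaller than in this deliberately extreme example.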

The gap between stated commitments and actual progress is real and publicly documented in their own reports. The IEA noted that while the tech industry has been a driver of renewable energy investment, the pace of AI infrastructure growth is outrunning the pace of clean power development.


Water use: the overlooked metric

Energy isn't the only resource. AI data centers are also significant consumers of water for cooling.

A 2023 paper from researchers at UC Riverside estimated that ChatGPT uses about 500 milliliters (0.13 gallons) of water per 20-50 chat interactions, mostly for cooling at Microsoft's data center operations. This is water that evaporates and doesn't return to local watersheds.

At scale: Microsoft reported using approximately 6.4 million cubic meters of water in fiscal year 2022 for its global operations, up 34% from the prior year. Google reported 5.6 billion gallons (about 21 million cubic meters) of water consumption in 2022.
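The water figures mix milliliters, gallons, and cubic meters, so a small conversion sketch helps keep them comparable. It uses the standard US liquid gallon (3.785411784 liters) and the figures quoted above; the function names are illustrative.

```python
LITERS_PER_US_GALLON = 3.785411784
LITERS_PER_CUBIC_METER = 1_000


def ml_per_interaction(total_ml, interactions_range):
    """Return (low, high) milliliters per interaction for a range of interaction counts."""
    fewest, most = interactions_range
    return total_ml / most, total_ml / fewest


def gallons_to_cubic_meters(gallons):
    """Convert US liquid gallons to cubic meters."""
    return gallons * LITERS_PER_US_GALLON / LITERS_PER_CUBIC_METER


if __name__ == "__main__":
    low, high = ml_per_interaction(500, (20, 50))
    print(f"Per interaction: {low:.0f}-{high:.0f} ml")
    google_m3 = gallons_to_cubic_meters(5.6e9)
    print(f"5.6 billion gallons = {google_m3 / 1e6:.1f} million cubic meters")
```

On the UC Riverside estimate, each chat interaction works out to roughly 10-25 ml of water.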

Water use matters most in regions where water is scarce. Phoenix, Arizona, and the desert southwest have large concentrations of data centers and chronic water stress. Some localities have begun restricting new data center construction specifically due to water use concerns.


The efficiency trajectory

The numbers above look alarming in isolation. The countervailing factor is that energy efficiency per computation has been improving dramatically and consistently.

GPU energy efficiency (performance per watt) has roughly doubled every 2-3 years. NVIDIA's H100 is substantially more energy-efficient per FLOP than the V100, released five years earlier, which was in turn more efficient than the P100 before it. Model efficiency has also improved: techniques like quantization, distillation, speculative decoding, and better architectures mean you can produce similar quality output with less compute.

The question is whether efficiency gains outpace usage growth. Historical precedent from previous computing waves (cloud, mobile, streaming video) suggests they don't: efficiency improvements make computation cheaper, which drives more usage, which grows total energy consumption even as per-unit efficiency improves. This is sometimes called the rebound effect.
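The race between efficiency and usage can be sketched with two exponentials. The model below is a deliberate simplification: it assumes total energy is proportional to usage divided by efficiency, with both growing at constant rates. The usage growth rates in the example are hypothetical inputs, not forecasts.

```python
def relative_energy(years, usage_growth_per_year, efficiency_doubling_years):
    """Total energy after `years`, relative to today (1.0 = unchanged)."""
    usage = (1 + usage_growth_per_year) ** years
    efficiency = 2 ** (years / efficiency_doubling_years)
    return usage / efficiency


def breakeven_usage_growth(efficiency_doubling_years):
    """Annual usage growth at which total energy stays flat."""
    return 2 ** (1 / efficiency_doubling_years) - 1


if __name__ == "__main__":
    for doubling in (2, 3):
        rate = breakeven_usage_growth(doubling)
        print(f"Efficiency doubling every {doubling} years "
              f"offsets usage growth of {rate:.0%} per year")
    # Hypothetical scenario: usage doubles every year, efficiency every two.
    print("Energy after 4 years:", relative_energy(4, 1.0, 2))
```

Under these assumptions, efficiency doubling every 2-3 years only holds total energy flat if usage grows by no more than about 26-41% per year. Anything faster increases total consumption, which is the rebound dynamic in numerical form.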

The IEA has projected that if AI inference workloads continue growing at 2024 rates, global AI energy demand could reach 1,000-1,200 TWh by 2030, roughly the current annual electricity consumption of Japan. Whether that happens depends on usage growth curves that are genuinely hard to predict.


What this means practically

For developers and businesses building on top of AI APIs, the energy footprint of your application is real. Some practical ways to think about it:

Use smaller models when they're good enough. A call to an 8B-parameter model such as Llama 3.1 8B can use roughly 10-50x less energy than a GPT-4o call for the same task. If your use case doesn't require frontier model capability, using a smaller model is both cheaper and lower-energy.

Batch requests where possible. Running inference on many inputs in a single batch is more energy-efficient than many individual requests because GPU utilization is higher during batched computation.

Cache aggressively. If the same query or similar queries come up repeatedly, caching the response avoids redundant computation entirely.
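As one illustration of the caching advice, here is a minimal in-memory response cache. It is a sketch under stated assumptions: `call_model` is a hypothetical stand-in for whatever client function your application uses, and the cache only catches repeats that are identical after whitespace and case normalization. Matching merely similar queries would need a semantic cache, which is not shown here.

```python
import hashlib


class CachedModel:
    """Wraps a model-calling function and reuses responses for repeated prompts."""

    def __init__(self, call_model):
        # call_model is a hypothetical callable: prompt (str) -> response (str).
        self._call_model = call_model
        self._cache = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt):
        # Assumption: prompts differing only in case or spacing are equivalent.
        # Drop the lower() call if case matters for your use case.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def complete(self, prompt):
        key = self._key(prompt)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        response = self._call_model(prompt)
        self._cache[key] = response
        return response


def fake_model(prompt):
    """Stub used for demonstration; a real client call would go here."""
    return f"echo: {prompt}"


if __name__ == "__main__":
    model = CachedModel(fake_model)
    model.complete("What is a TWh?")
    model.complete("what is a   TWh?")  # served from the cache
    print("hits:", model.hits, "misses:", model.misses)
```

A production cache would also need an eviction policy and an expiry time so that stale answers are not served indefinitely. Caching is only appropriate when responses are expected to be deterministic or interchangeable.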

The broader industry response is more complicated. Renewable energy procurement is happening at large scale, but the grid transition takes time. Carbon accounting methods vary and some are more meaningful than others. The honest position is that AI energy use is a real and growing part of global electricity demand, efficiency improvements are real but not keeping pace with growth, and the clean energy transition in power generation is the variable that matters most for the long-term environmental trajectory.

The numbers aren't reassuring, but they're also not apocalyptic. They describe a large industrial sector with a genuine environmental footprint, one that's growing quickly and that requires the same policy and engineering attention as other large-scale energy uses.
