How to Set an AI Budget in 2026: A Guide for Engineering and Finance
For most finance teams, AI spend is still a mystery. They can see the AWS bill and the SaaS subscriptions, but the AI component is scattered: some model API costs buried in engineering's cloud budget, some AI SaaS tools under software line items, some GPU time under R&D. Nobody has a clean number for "what we spend on AI," which makes planning next year's budget a guessing exercise.
This is fixable, and fixing it matters more in 2026 than it did two years ago because AI costs have scaled with AI adoption. A company that spent $80,000 on AI tools in 2024 might be spending $600,000 in 2026, and if finance can't see that number clearly, they can't make good decisions about where to invest or cut.
Here's how organizations that do this well structure their AI budgets.
First: get a real number for what you're spending now
Before planning next year's budget, you need to know what you're spending today. This sounds obvious, but many companies genuinely don't know. AI spend tends to live in several places at once.
Direct API costs. OpenAI, Anthropic, Google, Cohere, and similar provider invoices. These might be paid by engineering, IT, or individual business units on their own credit cards. Pull all of them.
AI SaaS subscriptions. Tools like Notion AI, GitHub Copilot, Jasper, Writer, and dozens of others. These often appear as separate SaaS line items, not as "AI." Do a pass through your software subscriptions looking for tools with an AI component.
Cloud compute for AI. If you're running any self-hosted models, fine-tuning jobs, or inference endpoints on AWS, GCP, or Azure, those costs are buried in your cloud bill. Most cloud providers have cost tags or service breakdowns that let you filter for GPU-intensive services.
Consulting and implementation. Fees for any external contractors helping with AI integration or prompting work.
Add these up. You probably have a number that's larger than anyone on the executive team expected.
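A minimal sketch of that roll-up, assuming you have already exported the line items into one list. The category names, source labels, and dollar amounts below are all hypothetical placeholders, not figures from any real company.

```python
from collections import defaultdict

# Hypothetical line items; every source label and amount is made up
# for illustration. Amounts are annual USD.
line_items = [
    {"category": "api", "source": "model provider invoices", "annual_usd": 150_000},
    {"category": "api", "source": "business-unit credit cards", "annual_usd": 30_000},
    {"category": "saas", "source": "coding assistant seats", "annual_usd": 90_000},
    {"category": "saas", "source": "writing tools", "annual_usd": 40_000},
    {"category": "cloud", "source": "GPU instances (tagged)", "annual_usd": 110_000},
    {"category": "consulting", "source": "integration contractors", "annual_usd": 60_000},
]


def total_by_category(items):
    """Sum annual spend per category."""
    totals = defaultdict(int)
    for item in items:
        totals[item["category"]] += item["annual_usd"]
    return dict(totals)


totals = total_by_category(line_items)
grand_total = sum(totals.values())

# Print categories from largest to smallest, with each one's share of the total.
for category, amount in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category:<12} ${amount:>9,}  ({amount / grand_total:.0%})")
print(f"{'total':<12} ${grand_total:>9,}")
```

Keeping the source label on each line item makes it easier to trace a surprising total back to the invoice or card statement it came from.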
Capex vs opex for AI
Traditional software has a clear capex/opex split: buying licenses or hardware is capex, paying subscriptions or cloud is opex. AI blurs this because model training, fine-tuning, and infrastructure build costs are significant one-time investments, while API usage is pure opex.
The practical implications for budget planning:
API usage is pure opex. Calls to hosted AI APIs are operational expense. They scale with usage, appear on your P&L, and should be budgeted by forecast rather than as a fixed annual commitment.
Fine-tuning is capex-adjacent. The cost to train or fine-tune a model is a one-time investment that produces a durable asset (the fine-tuned model). Many finance teams treat this as an R&D expense with the option to capitalize it as an intangible asset. Your accounting team should decide based on whether the model meets capitalization criteria (future economic benefits are probable, cost can be reliably measured, technical feasibility is established).
GPU hardware is classic capex. If your organization is buying or financing physical GPU hardware for inference, that's capital expenditure with depreciation schedules. Choosing between owned hardware and cloud GPU time is a rent-vs-own analysis: cloud is more expensive per compute-hour but more flexible, while owned hardware is cheaper per hour at scale but requires upfront investment and operational overhead.
For most organizations below about $5 million in annual AI compute spend, cloud APIs and cloud GPU rentals are usually the right choice. Above that level, the rent-vs-own math starts to favor owned infrastructure for predictable workloads.
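The rent-vs-own comparison can be sketched as a break-even calculation. This is a simplified model under stated assumptions: straight-line depreciation, a flat cloud hourly rate, and illustrative prices that are not market quotes. Every number and function name here is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365


def annual_cloud_cost(gpu_hours, cloud_rate_per_hour):
    """Renting: cost scales linearly with hours used."""
    return gpu_hours * cloud_rate_per_hour


def annual_owned_cost(gpu_hours, purchase_price, useful_life_years,
                      fixed_ops_per_year, variable_cost_per_hour):
    """Owning: straight-line depreciation plus fixed and per-hour running costs."""
    depreciation = purchase_price / useful_life_years
    return depreciation + fixed_ops_per_year + gpu_hours * variable_cost_per_hour


def break_even_hours(cloud_rate_per_hour, purchase_price, useful_life_years,
                     fixed_ops_per_year, variable_cost_per_hour):
    """GPU-hours per year above which owning is cheaper; None if it never is."""
    fixed = purchase_price / useful_life_years + fixed_ops_per_year
    saving_per_hour = cloud_rate_per_hour - variable_cost_per_hour
    if saving_per_hour <= 0:
        return None
    return fixed / saving_per_hour


# Illustrative assumptions for a single GPU (not real prices).
assumptions = dict(
    purchase_price=30_000,        # upfront hardware cost, USD
    useful_life_years=3,          # depreciation period
    fixed_ops_per_year=2_000,     # hosting, support, staff share
    variable_cost_per_hour=0.50,  # power and cooling while running
)
cloud_rate = 3.00                 # USD per GPU-hour when renting

hours = break_even_hours(cloud_rate, **assumptions)
print(f"Break-even: {hours:,.0f} GPU-hours per year "
      f"({hours / HOURS_PER_YEAR:.0%} utilization)")
```

The utilization figure is the useful output: owned hardware only wins if the workload is predictable enough to keep the GPU busy above that fraction of the year.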
Per-team allocation models
The choice between central AI budgeting and per-team AI budgeting has real tradeoffs. Here's how different organization types approach it.
Central pool model: All AI spend goes through a single budget managed by engineering or a central AI team. Business units request access to AI resources and the central team provides them. This model avoids duplication (you don't end up with five teams each paying for their own OpenAI subscriptions), makes it easy to negotiate enterprise contracts, and gives you a clear view of total spend. The downside is that it creates a bottleneck and makes it harder for individual teams to experiment quickly.
Departmental allocation model: Each department gets an AI budget allocation at the start of the year, like a software budget. They can spend it on AI tools and API access within a set of approved vendors. This gives teams more autonomy and accountability. The downside is that you lose central visibility and negotiating power, and teams may duplicate tools unnecessarily.
Hybrid model (common at companies with 500+ employees): A central contract and billing relationship for shared infrastructure (the main AI API keys, shared AI SaaS licenses, GPU cluster), with per-department allocations carved from that central agreement. Departments have their own quotas and chargebacks, but the commercial relationship is managed centrally. This is the model that scales best.
Chargeback: making AI costs visible to the teams using them
A chargeback model means that when a product team uses AI infrastructure, the cost is allocated back to their budget rather than absorbed by a central engineering budget. Even if the actual cash movement is just an internal accounting entry, making teams see the cost of their AI usage changes behavior significantly.
Teams that see AI costs charged to their budget tend to:
- Use cheaper models for tasks where quality differences are minimal
- Implement caching to avoid redundant API calls
- Actually track ROI rather than treating AI as a free resource
- Push back on AI feature ideas that don't have a clear business case
The mechanics of a chargeback model:
Tag every request. Every AI API call should be tagged with a department code, team, and ideally a product/feature identifier. LiteLLM's proxy server, Helicone, and Langfuse all support request tagging and can generate per-tag cost reports.
Set monthly caps. Each team should have a monthly budget cap with an alert (not a hard block) that fires when spend reaches 80% of it. Hard blocks are disruptive; soft alerts let teams decide whether to pull forward next month's budget or reduce usage.
Settle monthly. Run a monthly report showing actual AI spend per team. Share it with team leads. Even if the money doesn't actually move (pure accounting allocation), the visibility changes behavior.
A common objection: tagging every request adds engineering overhead. It does, but the overhead is small, typically a single metadata field on each API call, and the visibility it provides is worth it. The alternative is having no idea which features are expensive and which aren't.
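The tag, cap, and settle steps can be sketched in a few lines. This example does not use the API of LiteLLM, Helicone, or Langfuse; it assumes you have exported tagged usage records into a hypothetical `UsageRecord` structure, and all team names, costs, and budgets are invented.

```python
from collections import defaultdict
from dataclasses import dataclass

ALERT_THRESHOLD = 0.80  # soft alert at 80% of the monthly budget


@dataclass
class UsageRecord:
    """One tagged AI API call (hypothetical export format)."""
    department: str
    team: str
    feature: str
    cost_usd: float


def spend_by_team(records):
    """Total cost per (department, team) tag pair."""
    totals = defaultdict(float)
    for r in records:
        totals[(r.department, r.team)] += r.cost_usd
    return dict(totals)


def monthly_report(records, budgets):
    """One row per team with spend, budget, and a soft-alert flag."""
    spend = spend_by_team(records)
    rows = []
    for key in sorted(set(budgets) | set(spend)):
        spent = spend.get(key, 0.0)
        budget = budgets.get(key)  # None means the team has no budget set
        used = spent / budget if budget else None
        rows.append({
            "department": key[0],
            "team": key[1],
            "spent_usd": round(spent, 2),
            "budget_usd": budget,
            "used_fraction": used,
            # Alert only; nothing here blocks requests.
            "alert": used is not None and used >= ALERT_THRESHOLD,
        })
    return rows


records = [
    UsageRecord("product", "search", "semantic-ranking", 4200.0),
    UsageRecord("product", "search", "query-rewrite", 300.0),
    UsageRecord("marketing", "content", "draft-assistant", 1000.0),
]
budgets = {("product", "search"): 5000.0, ("marketing", "content"): 4000.0}

report = monthly_report(records, budgets)
for row in report:
    flag = "ALERT" if row["alert"] else "ok"
    print(f"{row['department']}/{row['team']}: ${row['spent_usd']:,.2f} [{flag}]")
```

Because the `feature` tag travels with every record, the same data can later be regrouped per feature without changing how requests are tagged.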
Budget planning by maturity stage
Year 1-2 of AI adoption: Budget primarily for experimentation. Allocate a fixed percentage of engineering headcount cost (typically 5-8%) to AI tools, API credits, and experimentation time. Don't try to build complex chargeback models yet; focus on getting teams experimenting and learning what works.
Year 2-3: AI is in production in at least a few places. Shift to a more detailed budget with specific allocations for production AI workloads (these should be budgeted against the business cases that justified them), exploration budget, and education/tooling budget. Start building chargeback visibility even if not enforcing it.
Year 3+: AI is significant operational infrastructure. The budget looks more like cloud infrastructure budgeting: a combination of committed spend (for predictable production workloads), variable capacity (for scaling), and R&D allocation. Finance should have a dedicated AI cost category. Annual planning should include AI cost per unit of output for major workloads.
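Two of the figures in these stages are simple arithmetic: the early-stage 5-8% allocation and the later-stage cost per unit of output. A rough sketch follows; the headcount cost, workload cost, and ticket volume are hypothetical inputs, and the function names are illustrative.

```python
def experimentation_budget_range(engineering_headcount_cost,
                                 low_pct=5, high_pct=8):
    """Early-stage AI budget as 5-8% of engineering headcount cost."""
    low = engineering_headcount_cost * low_pct / 100
    high = engineering_headcount_cost * high_pct / 100
    return low, high


def cost_per_unit(total_ai_cost, units_of_output):
    """Later-stage metric: AI cost per unit of output for one workload."""
    if units_of_output <= 0:
        raise ValueError("units_of_output must be positive")
    return total_ai_cost / units_of_output


# Hypothetical: $6M annual engineering headcount cost.
low, high = experimentation_budget_range(6_000_000)
print(f"Experimentation budget: ${low:,.0f} to ${high:,.0f}")

# Hypothetical: a support workload costing $90,000 that handled 450,000 tickets.
per_ticket = cost_per_unit(90_000, 450_000)
print(f"AI cost per support ticket: ${per_ticket:.2f}")
```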
Real numbers from organizations doing this well
A B2B SaaS company with about 200 employees: $340,000 in AI spend in 2025, split roughly 40% on GitHub Copilot and coding tools (from engineering budget), 35% on OpenAI API for their product features (from product budget, charged back to individual feature teams), 15% on AI writing tools (from marketing), and 10% on internal experimentation.
A financial services firm with 1,200 employees: centralized AI budget of $2.1 million, managed by a dedicated AI platform team. Business units submit requests for AI projects with business cases; the platform team allocates resources and charges back actual usage quarterly. The CFO has full cost visibility, and per-project ROI tracking is mandatory for any project over $50,000.
A 40-person startup: no formal AI budget process yet. AI API spend averages $12,000 per month, paid from a shared engineering AWS account. They know roughly what they're spending but have no per-feature attribution. They plan to add request tagging and per-feature dashboards in the next quarter.
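To put the percentages above in dollar terms, the short calculation below converts the 200-person SaaS company's split into amounts and annualizes the startup's monthly spend. It uses only the figures quoted in these examples; the variable names are illustrative.

```python
# B2B SaaS company: $340,000 total, split by the percentages quoted above.
total_spend = 340_000
shares_pct = {
    "GitHub Copilot and coding tools": 40,
    "OpenAI API for product features": 35,
    "AI writing tools": 15,
    "internal experimentation": 10,
}

# Integer arithmetic keeps the dollar amounts exact.
breakdown = {name: total_spend * pct // 100 for name, pct in shares_pct.items()}
for name, amount in breakdown.items():
    print(f"{name:<34} ${amount:>8,}")

# Startup: $12,000 per month, annualized assuming spend stays flat.
startup_annualized = 12_000 * 12
print(f"{'startup API spend (annualized)':<34} ${startup_annualized:>8,}")
```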
The forecast problem: AI costs are hard to predict
Unlike SaaS subscriptions, API-based AI costs are usage-based and can grow or shrink unpredictably as product features ship, as user adoption grows, or as providers change model pricing. This makes forecasting harder.
Practical approaches to forecasting API costs:
Per-unit model: Estimate cost per business transaction that involves AI (cost per support ticket handled, cost per document processed, cost per user session). Multiply by projected transaction volume. This gives you a cost that scales with the business forecast, which finance teams prefer.
Capacity planning with buffer: Estimate peak daily volume, calculate the daily API cost at that volume, multiply by the number of days in the period, and add a 20% buffer. This is conservative but prevents under-budgeting.
Historical growth rate: If you have 6+ months of AI cost data, extrapolate using historical growth rate. Apply judgment about whether that growth rate will continue, accelerate, or slow based on your product roadmap.
Most finance teams will ask for all three approaches and use the range to set a contingency. Build the contingency into your budget ask as an explicit line rather than a hidden reserve.
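The three approaches can be computed side by side to produce that range. This is a sketch under assumptions: the capacity method is read as daily cost at peak times days in the period, plus a 20% buffer, and the growth method compounds a constant monthly rate. All input figures and function names are hypothetical.

```python
def per_unit_forecast(cost_per_transaction, projected_transactions):
    """Per-unit model: unit cost times projected business volume."""
    return cost_per_transaction * projected_transactions


def capacity_forecast(peak_daily_volume, cost_per_call, days_in_period,
                      buffer=0.20):
    """Capacity planning: cost at peak volume every day, plus a buffer."""
    daily_cost_at_peak = peak_daily_volume * cost_per_call
    return daily_cost_at_peak * days_in_period * (1 + buffer)


def growth_forecast(last_month_cost, monthly_growth_rate, months):
    """Historical growth: sum of monthly costs compounding at a constant rate."""
    total = 0.0
    cost = last_month_cost
    for _ in range(months):
        cost *= 1 + monthly_growth_rate
        total += cost
    return total


# Hypothetical inputs for a 12-month budget period.
forecasts = {
    "per-unit": per_unit_forecast(cost_per_transaction=0.20,
                                  projected_transactions=600_000),
    "capacity": capacity_forecast(peak_daily_volume=2_500, cost_per_call=0.20,
                                  days_in_period=365),
    "growth": growth_forecast(last_month_cost=8_000, monthly_growth_rate=0.03,
                              months=12),
}

for name, value in forecasts.items():
    print(f"{name:<9} ${value:>10,.0f}")

low, high = min(forecasts.values()), max(forecasts.values())
print(f"range     ${low:,.0f} to ${high:,.0f}")
```

A wide gap between the lowest and highest forecast is itself a finding: it signals how much contingency the budget ask needs to carry.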
AI budgeting isn't fundamentally different from cloud infrastructure budgeting, which most finance teams have figured out over the last decade. The same principles apply: get visibility first, allocate accountability second, optimize third. The mistake is trying to optimize before you have visibility. Start with getting a real number for what you're spending today, and go from there.