
How to Scope an AI Program Budget at a Bank

The pilot was budgeted at $400,000. By month eight, the ask was $1.8 million and nobody could explain why the original number was so wrong. The CFO was not pleased. The program lead was on the way out. And the initiative — which had genuinely promising results from the pilot — was close to being cancelled not because it didn't work but because nobody had scoped it honestly from the start.

Budget failure is one of the most consistent causes of stalled AI programs at financial institutions, and it is almost entirely avoidable. The programs that run out of money six months in did not encounter bad luck. They were scoped around the part of the work that everyone could see — the model, the vendor license, the data science team — and omitted the parts that are harder to scope but typically cost more.

This article is about how to build a budget that reflects what an AI program at a regulated financial institution actually costs, and how to present that number to a CFO who may be anchored to a much smaller figure.

The five buckets that belong in every AI program budget

Model and vendor costs. This number, and almost nothing else, appears in most initial budgets: the vendor license, the API costs, the data science team time spent building and tuning the model. For a vendor-supplied AI system, this might be a per-seat or per-transaction license fee. For an internally developed model, it includes the engineering and data science labor to build it.

This is typically one of the smaller of the five buckets, and the only one that gets properly scoped in the initial business case. A $300,000 vendor license feels like the cost of the program. It is not. It is the cost of the model, and the model is a small part of the work.

Integration engineering. Getting an AI system from a working model to a production deployment requires a live data pipeline, authentication and authorization, logging and audit trails, error handling, monitoring infrastructure, and deployment tooling that can be patched, updated, and rolled back. None of this was in the pilot. The pilot ran on exported data in a notebook. Production runs on live systems with real dependencies and real consequences when something breaks.

Integration engineering typically costs three to five times the model cost. A $300,000 model or vendor license should be paired with an integration budget of $900,000 to $1.5 million. This number lands badly in most initial budget conversations, which is why it is often omitted — and why programs run out of money in month six when the integration work turns out to cost exactly what it was always going to cost.

The multiplier varies by use case and by institution. An institution with modern, well-documented APIs and a mature data engineering team can integrate faster and cheaper than one running on legacy core systems with limited documentation. Knowing where your institution sits on that spectrum before finalizing the budget is part of the scoping work.
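The three-to-five-times rule of thumb above is simple enough to put in a few lines. This is a minimal sketch, not a costing method: the function name and parameter names are hypothetical, and the default multiples are only the range quoted in this article. An institution on legacy core systems may need a different range.

```python
def integration_range(model_cost, low_multiple=3, high_multiple=5):
    """Return (low, high) integration budget estimates for a given model cost.

    The default multiples are the article's rule of thumb, not a standard.
    """
    if model_cost < 0:
        raise ValueError("model_cost must be non-negative")
    return model_cost * low_multiple, model_cost * high_multiple


low, high = integration_range(300_000)
print(f"Integration budget: ${low:,} to ${high:,}")
# prints: Integration budget: $900,000 to $1,500,000
```

Keeping the multiples as explicit parameters forces the scoping conversation about where your institution sits on the spectrum, rather than burying that judgment in a single number.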

Model Risk Management and governance. Getting an AI system through MRM validation at a regulated financial institution takes time and internal resources. The data science team spends time preparing documentation, answering questions, and attending review meetings. The MRM team spends time reviewing. Legal reviews vendor agreements. Compliance assesses fair lending implications. None of this is free, and none of it appears in most initial AI program budgets.

For a straightforward model at a well-prepared institution, governance work might add 10 to 15 percent to the total program cost. For a novel system type at an institution with an MRM team that has limited AI experience, it can add considerably more — not because anyone is being obstructionist, but because the work is genuinely complex and the questions are genuinely hard. Budget for it explicitly, with a timeline assumption, and identify the internal resources who will be doing it.

Change management and training. An AI system that affects how people do their work requires those people to change how they do their work. Training the operations team. Redesigning workflows to incorporate the system's output. Communicating to staff why the system exists and what it does. Managing the transition period when the new system and the old process coexist. Building feedback mechanisms so users can flag problems and the system can improve.

This work is described in almost every AI program plan and funded in almost none of them. The assumption is that training can be handled by the project team, that workflow redesign will happen naturally, and that adoption will follow deployment. None of these assumptions holds. The change management gap is one of the most reliable predictors of low utilization: technically functional systems that post a usage figure, such as 23%, that nobody can explain.

Change management for an AI deployment affecting a significant operational team should be budgeted as a distinct workstream with a named owner, a plan, and real resources. For a program affecting 50 to 200 operational staff, this might be $150,000 to $400,000 depending on the complexity of the workflow change and the sophistication required in the training.

Contingency. Ten percent contingency is standard in technology program budgets. It is not enough for a first AI program at a financial institution. The unknowns are larger, the dependencies are more complex, and the institutional processes — MRM, IT security, legal, procurement — take longer than planned with a regularity that suggests the standard timeline assumptions are systematically optimistic.

For a first AI program, budget 20 to 25 percent contingency. For a second or third program at an institution that has learned its own processes, 15 percent may be appropriate. The contingency is not expected to be spent. It is insurance against the specific uncertainty of navigating institutional processes that most teams have not navigated before at this level of complexity.
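The five buckets can be rolled up into a single number with a short calculation. The sketch below is illustrative only, and it rests on two assumptions the article does not settle: governance is treated as a percentage uplift on the direct costs (model, integration, change management), and contingency is applied to the subtotal before contingency. All names and the sample inputs are hypothetical. Substitute your own figures.

```python
from dataclasses import dataclass


@dataclass
class BudgetInputs:
    model_cost: int            # vendor license or internal build cost, in dollars
    integration_multiple: int  # 3 to 5 per the rule of thumb; institution-specific
    change_management: int     # distinct workstream budget, in dollars
    governance_pct: int        # ASSUMPTION: percent uplift on direct costs
    contingency_pct: int       # ASSUMPTION: percent of subtotal before contingency


def build_budget(inputs):
    """Return the five buckets plus the total, in whole dollars."""
    integration = inputs.model_cost * inputs.integration_multiple
    direct = inputs.model_cost + integration + inputs.change_management
    governance = direct * inputs.governance_pct // 100
    subtotal = direct + governance
    contingency = subtotal * inputs.contingency_pct // 100
    return {
        "model_and_vendor": inputs.model_cost,
        "integration": integration,
        "governance": governance,
        "change_management": inputs.change_management,
        "contingency": contingency,
        "total": subtotal + contingency,
    }


# Illustrative inputs only: a first program, mid-range multiplier.
budget = build_budget(BudgetInputs(300_000, 4, 250_000, 12, 20))
for bucket, amount in budget.items():
    print(f"{bucket:<20}${amount:>12,}")
```

With these sample inputs the total comes to $2,352,000, of which the model is $300,000. Changing the contingency from 20 to 10 percent shows how much protection the lower figure gives up.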

The line items that always get cut and always cause problems

When an AI program budget goes through finance review, certain line items attract scrutiny. The people cutting them are doing their jobs — they are applying standard cost discipline to a budget they understand imperfectly. The result is a budget that looks tighter and more defensible but that produces exactly the cost overrun it was designed to prevent.

The data engineering work that "should already be in place." Whoever reviews the budget will observe that the institution already has a data engineering team, that the data infrastructure already exists, and that this work should not require a separate budget line. The observation is technically correct and operationally wrong. The data engineering team has existing commitments. The data infrastructure exists but was not designed for real-time AI inference. The work required is real work, and if it isn't budgeted, it either doesn't happen or it happens at the expense of something else that was supposed to happen.

The ongoing monitoring budget. After the system is deployed, someone has to monitor it. The monitoring plan requires tooling, regular review, and a defined response process when something triggers. This is recurring cost, not one-time project cost, and it belongs in the budget as a line item. It is frequently cut or deferred to an operational budget that hasn't been created yet — which means the monitoring doesn't happen, and the system drifts without anyone noticing.

The dedicated program management resource. AI programs at financial institutions require someone whose job is to run the program: coordinating data science, IT, MRM, the business unit, and the vendor. This is not the CTO's job. It is not the data scientist's job. It is a distinct role that requires distinct time. Programs that don't budget for it discover that the work gets done inadequately by people handling it on top of their primary responsibilities, and the result is the governance gaps and integration delays that consistently stall programs.

How to build a phased budget with go/no-go checkpoints

A phased budget is considerably easier to approve than a full program budget, and it is also more honest — because it acknowledges that the cost estimates for later phases are less certain than the estimates for earlier ones.

Phase one covers the diagnostic and planning work: validating the use case against production data, completing the MRM pre-engagement conversation, scoping the integration work with the actual IT team rather than in the abstract, and producing a fully loaded budget for phases two and three. Phase one is relatively cheap, typically $150,000 to $300,000, and it produces the information needed to finalize the rest of the budget with confidence rather than optimism.

The go/no-go at the end of phase one is real. If the production data validation shows the model won't perform as expected, if MRM identifies a blocker that will take twelve months to resolve, or if the integration scope turns out to be twice what was assumed, the institution has spent a modest amount to discover that before committing to the full program cost. That is the function of a phase one budget, and it is a function that most institutions skip by approving the full program budget based on pilot assumptions that have never been tested against production reality.
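One way to keep the checkpoint honest is to write down the stop conditions before phase one begins. The sketch below encodes the three findings described above as explicit flags. The class, field names, and thresholds are hypothetical: the twelve-month and two-times values simply mirror the examples in the text, and a real checkpoint would be a judgment made by named owners, not a function call.

```python
from dataclasses import dataclass


@dataclass
class PhaseOneFindings:
    model_meets_target_on_production_data: bool
    mrm_blocker_months: int         # 0 if MRM identified no blocker
    integration_scope_ratio: float  # scoped integration cost / assumed cost


def checkpoint_flags(findings, max_blocker_months=12, max_scope_ratio=2.0):
    """Return the list of stop conditions triggered by the phase one findings.

    Thresholds are illustrative assumptions taken from the article's examples.
    """
    flags = []
    if not findings.model_meets_target_on_production_data:
        flags.append("model underperforms on production data")
    if findings.mrm_blocker_months >= max_blocker_months:
        flags.append(f"MRM blocker of {findings.mrm_blocker_months} months")
    if findings.integration_scope_ratio >= max_scope_ratio:
        flags.append(
            f"integration scope is {findings.integration_scope_ratio:.1f}x the assumption"
        )
    return flags


findings = PhaseOneFindings(True, 3, 2.4)
flags = checkpoint_flags(findings)
print("go" if not flags else "stop or rescope: " + "; ".join(flags))
# prints: stop or rescope: integration scope is 2.4x the assumption
```

Agreeing on the thresholds up front is the point. A checkpoint whose criteria are defined after the findings arrive tends to approve whatever was found.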

Presenting a realistic number to a CFO anchored to the pilot cost

The hardest conversation in AI program budgeting is the one where the pilot cost $400,000 and the production program costs $1.8 million. The CFO has an anchor. The anchor is wrong, but it is real, and presenting a number that is 4.5 times the anchor without preparation will produce a difficult meeting.

The preparation is to explain the multiplier before presenting the number. Describe what a pilot is — a model running on curated data in a controlled environment, with no integration, no audit trail, no monitoring, no change management, no governance review. Describe what production is — a live system running on real data, integrated with real systems, validated by MRM, monitored continuously, adopted by an operational team. The gap between those two things is why the number is what it is.

Then present the number in the format a CFO can evaluate: total cost with a clear breakdown by bucket, a phased structure with go/no-go checkpoints, a defined exit cost if the program is terminated at phase one, and named owners for each phase. That presentation does not make the number smaller. It makes it defensible — which is what gets it approved.

The programs that run out of budget were not underfunded. They were misfunded — the money went to the part of the work everyone could see and not to the parts that actually determine whether the program reaches production. Saying the real number out loud, with a credible breakdown, is almost always the thing that unblocks the budget conversation.

Working on a budget for an AI program, or trying to explain why the original number was wrong? I'm glad to help think through what the real scope looks like.
