Chatbot Containment Rate Contact Centre: Why It's a Vanity Metric and What to Measure Instead
- Graeme Colville
Containment rate has become the primary success metric for chatbot and automated self-service deployment in contact centres.
The logic seems sound: if a customer completes their interaction in the automated channel without reaching an agent, the contact has been contained.
The operation has reduced agent load, reduced cost per contact, and demonstrated that the automation is working.
The problem is what chatbot containment rate does not measure.
It does not measure whether the customer's issue was resolved.
A contained contact and a resolved contact are not the same thing.
The CSAT measurement problem is structurally identical and explained in full in What Does CSAT Measure - And What Does It Miss?
In operations where a significant proportion of demand is structurally generated, the difference between those two measures is the difference between a metric that looks like success and an operation that is quietly getting worse.
This measurement failure is one of three structural reasons explored in Contact Centre AI Not Reducing Volume: The Structural Explanation Nobody Is Naming.
Chatbot Containment Rate Is the New AHT
The structural problem with chatbot containment rate in a contact centre context is identical to the structural problem with AHT targets.
Both are output metrics being managed as proxies for resolution quality.
Both can be optimised to show positive results while the underlying operational performance deteriorates.
AHT targets are met when agents close contacts faster.
Whether the contact was resolved or truncated is not visible in the metric.
Contacts that were truncated - where the agent ended the interaction before the underlying issue was addressed - generate repeat demand.
The AHT metric improves.
The repeat contact rate rises.
The operation does not connect the two because the measurement framework is not designed to show it.
Chatbot containment rate follows the same logic.
The metric is met when customers complete an automated journey without reaching an agent.
Whether the journey resolved the customer's issue is not visible in the metric.
A customer who navigates a chatbot interaction, receives information that does not match their query, and abandons the interaction has been contained.
A customer who completes an automated self-service flow, finds the outcome incorrect, and calls back thirty minutes later may be recorded as a successfully contained interaction followed by an unrelated inbound contact.
The parallel goes further.
AHT targets create an incentive to close contacts quickly regardless of outcome.
Chatbot containment rate creates an incentive to prevent transfer to an agent regardless of outcome.
In both cases, the metric is rewarding the wrong behaviour - not resolution, but the appearance of resolution.
Operations that have spent years arguing against AHT as a proxy for performance are often deploying chatbot containment rate without recognising that they are making the same structural error in a different channel.
The metric improves.
The experience deteriorates.
The operation reports success.
What Contained Actually Means in Practice
In most chatbot implementations, a contact is recorded as contained when the customer does not transfer to an agent.
This definition captures a wide range of outcomes that have nothing to do with resolution.
The customer who received the information they needed and completed their intended action is contained.
The customer who could not find what they needed and abandoned the interaction is also contained.
The customer who found the chatbot unhelpful, gave up, and called back on a different channel is contained in the chatbot data and appears as a new inbound contact in the voice data.
The customer who received incorrect information and did not realise it until a downstream consequence emerged is contained.
All of these outcomes are reported identically in the containment metric.
The operation cannot distinguish a resolved contact from an abandoned one, a successful self-service from a frustrated departure, without supplementary data that most operations are not collecting.
Consider what this means in a high-volume contact centre running a chatbot against its top contact types.
A policy query chatbot is reporting 72% containment.
On the surface that is a strong result. But if 30% of those contained contacts are customers who received an answer that did not match their query and abandoned without escalating, the effective resolution rate is materially lower than 72%.
If a further 15% of contained customers call back within seven days about the same issue, the chatbot has not resolved those contacts.
It has deferred them.
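The arithmetic in this scenario can be made explicit. The sketch below uses the hypothetical rates above (72% containment, 30% mismatched-answer abandons, 15% seven-day callbacks, a made-up contact volume) and treats the two failure groups as disjoint, as the scenario implies:

```python
# Illustrative arithmetic for the policy chatbot scenario above.
# All figures are the hypothetical rates from the text, not real data.

total_contacts = 10_000          # assumed contact volume
containment_rate = 0.72          # reported containment
mismatch_abandon_share = 0.30    # contained contacts that abandoned after a wrong answer
repeat_within_7d_share = 0.15    # contained contacts that called back within 7 days

contained = total_contacts * containment_rate
# Contacts genuinely resolved in the automated channel: contained,
# minus abandons, minus seven-day callbacks.
resolved_contained = contained * (1 - mismatch_abandon_share - repeat_within_7d_share)

effective_resolution_rate = resolved_contained / total_contacts
print(f"Reported containment:      {containment_rate:.0%}")
print(f"Effective resolution rate: {effective_resolution_rate:.1%}")
```

On these assumed numbers, a 72% containment figure corresponds to an effective resolution rate of roughly 40% - a gap the containment metric alone never surfaces.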
A second scenario makes the problem sharper.
A claims status chatbot is reporting 68% containment.
The operation considers this underperforming relative to the policy chatbot.
But when repeat contact data is pulled for the two cohorts, the claims chatbot customers are returning at a rate of 12% within seven days, while the policy chatbot customers are returning at 31%.
The claims chatbot, despite lower containment, is resolving more contacts.
The policy chatbot, despite higher containment, is deferring a third of its interactions back into the voice queue.
The containment rate is ranking them in the wrong order.
The operation has been investing in optimising the wrong chatbot.
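The ranking inversion in the two-chatbot comparison above can be shown in a few lines. This is a sketch using the hypothetical rates from the scenario; "resolved" here means contained and not returning within seven days:

```python
# Cohort comparison from the scenario above: containment alone ranks the
# policy chatbot higher, but netting out seven-day repeats inverts the order.

chatbots = {
    "policy": {"containment": 0.72, "repeat_7d": 0.31},
    "claims": {"containment": 0.68, "repeat_7d": 0.12},
}

for name, m in chatbots.items():
    # Share of all contacts genuinely resolved in the automated channel.
    m["resolved"] = m["containment"] * (1 - m["repeat_7d"])

by_containment = max(chatbots, key=lambda n: chatbots[n]["containment"])
by_resolution = max(chatbots, key=lambda n: chatbots[n]["resolved"])
print(f"Best by containment: {by_containment}")  # policy
print(f"Best by resolution:  {by_resolution}")   # claims
```

The policy chatbot resolves roughly 50% of its contacts end to end; the claims chatbot resolves roughly 60%, despite the lower headline containment figure.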
This is not a technology problem.
It is a measurement design problem - the same problem that allows CSAT scores to remain stable while complaints rise, because satisfaction is measured at the point of interaction and resolution is measured nowhere.

What Chatbot Containment Rate Should Be Measured Alongside
Chatbot containment rate in a contact centre is not a useless metric.
It becomes useful when it is one of several measures rather than the primary success indicator.
Three supplementary measures change what the data is telling the operation.
Repeat contact rate at customer level following a contained interaction is the most important.
If customers who completed an automated interaction return to the contact centre within seven to fourteen days, the interaction was not resolved - it was contained and incomplete.
Tracking this cohort separately from customers who did not interact with automation shows whether the automated channel is genuinely resolving demand or deferring it.
An operation running this analysis will often find that the repeat contact rate for post-chatbot customers is higher than for customers who reached an agent first time, which inverts the efficiency narrative the containment rate was supporting.
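The cohort analysis described above can be sketched against a flat contact log. The record layout, field names, and the seven-day window here are assumptions for illustration; most operations would run this against their own contact history tables:

```python
# Minimal sketch: repeat contact rate at customer level, split by whether
# the initial contact went through the automated channel. The toy log and
# its fields are assumptions, not a real schema.
from datetime import datetime, timedelta
from collections import defaultdict

contacts = [
    # (customer_id, timestamp, via_chatbot)
    ("c1", datetime(2024, 5, 1), True),
    ("c1", datetime(2024, 5, 4), False),   # returned within 7 days, on voice
    ("c2", datetime(2024, 5, 1), True),
    ("c3", datetime(2024, 5, 1), False),
]

WINDOW = timedelta(days=7)

def repeat_rate(log, via_chatbot):
    """Share of a cohort's contacts followed by another contact from the
    same customer within WINDOW, regardless of the return channel."""
    by_customer = defaultdict(list)
    for cust, ts, _ in log:
        by_customer[cust].append(ts)
    cohort = [(cust, ts) for cust, ts, bot in log if bot == via_chatbot]
    repeats = sum(
        any(ts < other <= ts + WINDOW for other in by_customer[cust])
        for cust, ts in cohort
    )
    return repeats / len(cohort) if cohort else 0.0

print(repeat_rate(contacts, via_chatbot=True))   # chatbot cohort
print(repeat_rate(contacts, via_chatbot=False))  # agent-first cohort
```

The key design choice is that the repeat lookup is channel-agnostic: a chatbot contact followed by a voice call still counts as a repeat, which is exactly the linkage the standard containment report does not make.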
Transfer rate from automated to agent is the second.
A chatbot interaction that ends in agent transfer is not a contained contact. It is a routing step with extra friction.
High transfer rates from automated channels indicate either that the automated journey cannot resolve the contact types it is receiving, or that customers are not trusting the automation to resolve their issue and are opting for human interaction regardless.
Both findings require a different response than optimising the chatbot design.
The first requires a demand classification exercise - the chatbot may be receiving contacts it was never going to resolve.
The second requires a trust audit - customers may have learned from experience that the chatbot does not help.
Completion rate within automated journeys - the proportion of customers who start an automated interaction and complete the intended action - shows whether the journey is functional at a more granular level than containment rate allows.
Abandonment within the automated journey is a warning signal that something in the design or the scope of automation is creating friction rather than resolution.
AI triage compounds this friction further - Contact Centre AI Triage: What It Does to Your Escalation Rate explains how the two failure modes stack on top of each other.
An operation tracking completion rate by contact type can identify exactly which journeys are working and which are failing, rather than averaging the performance across the entire chatbot into a single containment figure that obscures the variation.
Taken together, these three measures - repeat contact rate, transfer rate, and completion rate - give the operation a view of the automated channel that containment rate alone cannot provide.
The operation can see not just how many contacts were contained, but how many were resolved, how many were deferred, how many were frustrated departures, and how many were contacts the automation was never equipped to handle.
Making the Case for Changing the Measurement Framework
Changing the primary metric for chatbot performance is not a technical challenge.
It is a political one. Containment rate is typically owned by a specific team - digital, automation, or technology - and reported to a leadership audience that has been using it as a benchmark for return on investment.
Proposing to change or supplement it raises a question that carries risk: if containment rate is not the right measure, what does that say about the investment decisions that were made using it?
The argument that lands most effectively is not that containment rate is wrong.
It is that containment rate is incomplete.
The framing is additive rather than corrective: we are not removing a metric, we are building a fuller picture.
The existing containment figures are not invalidated - they remain the denominator.
What changes is the numerator: instead of measuring contacts that did not transfer, the operation begins measuring contacts that did not return.
The business case writes itself once the repeat contact data is pulled.
If the chatbot is deferring a material proportion of contacts back into the voice queue within seven days, the cost per contact calculation that justified the automation investment is overstated.
The saving is not the cost of the contained contact.
It is the cost of the contained contact minus the cost of the repeat contact it generated.
In operations where the repeat rate is high, the net saving is significantly lower than the containment rate implies - and in some cases the automation is generating more cost than it is removing once the full contact lifecycle is costed.
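The full-lifecycle cost logic above reduces to a short calculation. The cost figures and volumes here are hypothetical assumptions chosen to illustrate the shape of the result, not benchmarks:

```python
# Sketch of the lifecycle costing described above. All inputs are assumed.

contained_contacts = 7_200       # contacts that never reached an agent
repeat_share = 0.31              # contained contacts returning within 7 days
cost_agent_contact = 6.50        # assumed fully loaded cost of an agent contact
cost_bot_contact = 0.40          # assumed cost of an automated interaction

# Saving as the containment report implies it: every contained contact
# avoided an agent contact.
gross_saving = contained_contacts * (cost_agent_contact - cost_bot_contact)

# Cost added back by the repeat contacts the contained cohort generated.
repeat_cost = contained_contacts * repeat_share * cost_agent_contact

net_saving = gross_saving - repeat_cost
print(f"Gross saving implied by containment: {gross_saving:,.0f}")
print(f"Cost of repeat contacts generated:   {repeat_cost:,.0f}")
print(f"Net saving:                          {net_saving:,.0f}")
```

On these assumed inputs the repeat contacts claw back roughly a third of the headline saving; with a higher repeat share or a cheaper agent-contact baseline, the net figure can approach zero.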
That analysis, presented alongside the existing containment figures, reframes the conversation.
The question is no longer whether the chatbot is working. It is what working actually means - and whether the current measurement framework is capable of answering it.
The Question to Ask Before Reporting Containment Rate
Before presenting contact centre chatbot containment rate results as a success metric, one question cuts through the ambiguity: of the contacts recorded as contained, how many did not return?
If the operation cannot answer that question - if repeat contact data is not being tracked at customer level against automated channel interactions - the containment rate is measuring activity, not outcomes.
It is counting the contacts that did not transfer to an agent, without knowing whether those customers got what they came for.
That is the definition of a vanity metric. It looks like performance. It does not measure it.
If your contact centre's chatbot containment metrics are strong but contact volume is not falling, the Sentiment Gap intervention identifies whether your measurement framework is capturing the same thing your customers are experiencing - and what the gap is hiding.