Chatbot Containment Rate Contact Centre: Why It's a Vanity Metric and What to Measure Instead
- Graeme Colville
Containment rate has become the primary success metric for chatbot and automated self-service deployment in contact centres.
The logic seems sound: if a customer completes their interaction in the automated channel without reaching an agent, the contact has been contained.
The operation has reduced agent load, reduced cost per contact, and demonstrated that the automation is working.
The problem is what chatbot containment rate does not measure.
It does not measure whether the customer's issue was resolved.
A contained contact and a resolved contact are not the same thing.
The CSAT measurement problem is structurally identical and explained in full in What Does CSAT Measure - And What Does It Miss?
In operations where a significant proportion of demand is structurally generated, the difference between those two measures is the difference between a metric that looks like success and an operation that is quietly getting worse.
This measurement failure is one of three structural reasons explored in Contact Centre AI Not Reducing Volume: The Structural Explanation Nobody Is Naming.
Chatbot Containment Rate Is the New AHT
The structural problem with chatbot containment rate in a contact centre context is identical to the structural problem with AHT targets.
Both are output metrics being managed as proxies for resolution quality.
Both can be optimised to show positive results while the underlying operational performance deteriorates.
AHT targets are met when agents close contacts faster.
Whether the contact was resolved or truncated is not visible in the metric.
Contacts that were truncated - where the agent ended the interaction before the underlying issue was addressed - generate repeat demand.
The AHT metric improves.
The repeat contact rate rises.
The operation does not connect the two because the measurement framework is not designed to show it.
Chatbot containment rate follows the same logic.
The metric is met when customers complete an automated journey without reaching an agent.
Whether the journey resolved the customer's issue is not visible in the metric.
A customer who navigates a chatbot interaction, receives information that does not match their query, and abandons the interaction has been contained.
A customer who completes an automated self-service flow, finds the outcome incorrect, and calls back thirty minutes later may be recorded as a successfully contained interaction followed by an unrelated inbound contact.
The parallel goes further.
AHT targets create an incentive to close contacts quickly regardless of outcome.
Chatbot containment rate creates an incentive to prevent transfer to an agent regardless of outcome.
In both cases, the metric is rewarding the wrong behaviour - not resolution, but the appearance of resolution.
Operations that have spent years arguing against AHT as a proxy for performance are often deploying chatbot containment rate without recognising that they are making the same structural error in a different channel.
The metric improves.
The experience deteriorates.
The operation reports success.
What Contained Actually Means in Practice
In most chatbot implementations, a contact is recorded as contained when the customer does not transfer to an agent.
This definition captures a wide range of outcomes that have nothing to do with resolution.
The customer who received the information they needed and completed their intended action is contained.
The customer who could not find what they needed and abandoned the interaction is also contained.
The customer who found the chatbot unhelpful, gave up, and called back on a different channel is contained in the chatbot data and appears as a new inbound contact in the voice data.
The customer who received incorrect information and did not realise it until a downstream consequence emerged is contained.
All of these outcomes are reported identically in the containment metric.
The operation cannot distinguish a resolved contact from an abandoned one, a successful self-service from a frustrated departure, without supplementary data that most operations are not collecting.
Consider what this means in a high-volume contact centre running a chatbot against its top contact types.
A policy query chatbot is reporting 72% containment.
On the surface that is a strong result. But if 30% of those contained contacts are customers who received an answer that did not match their query and abandoned without escalating, the effective resolution rate is materially lower than 72%.
If a further 15% of contained customers call back within seven days about the same issue, the chatbot has not resolved those contacts.
It has deferred them.
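The arithmetic in this scenario can be made explicit. The sketch below uses the hypothetical rates above (72% containment, 30% mismatched-answer abandons, 15% seven-day callbacks, a made-up contact volume) and treats the two failure groups as disjoint, as the scenario implies:

```python
# Illustrative arithmetic for the policy chatbot scenario above.
# All figures are the hypothetical rates from the text, not real data.

total_contacts = 10_000          # assumed contact volume
containment_rate = 0.72          # reported containment
mismatch_abandon_share = 0.30    # contained contacts that abandoned after a wrong answer
repeat_within_7d_share = 0.15    # contained contacts that called back within 7 days

contained = total_contacts * containment_rate
# Contacts genuinely resolved in the automated channel: contained,
# minus abandons, minus seven-day callbacks.
resolved_contained = contained * (1 - mismatch_abandon_share - repeat_within_7d_share)

effective_resolution_rate = resolved_contained / total_contacts
print(f"Reported containment:      {containment_rate:.0%}")
print(f"Effective resolution rate: {effective_resolution_rate:.1%}")
```

On these assumed numbers, a 72% containment figure corresponds to an effective resolution rate of roughly 40% - a gap the containment metric alone never surfaces.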
A second scenario makes the problem sharper.
A claims status chatbot is reporting 68% containment.
The operation considers this underperforming relative to the policy chatbot.
But when repeat contact data is pulled for the two cohorts, the claims chatbot customers are returning at a rate of 12% within seven days, while the policy chatbot customers are returning at 31%.
The claims chatbot, despite lower containment, is resolving more contacts.
The policy chatbot, despite higher containment, is deferring a third of its interactions back into the voice queue.
The containment rate is ranking them in the wrong order.
The operation has been investing in optimising the wrong chatbot.
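The ranking inversion in the two-chatbot comparison above can be shown in a few lines. This is a sketch using the hypothetical rates from the scenario; "resolved" here means contained and not returning within seven days:

```python
# Cohort comparison from the scenario above: containment alone ranks the
# policy chatbot higher, but netting out seven-day repeats inverts the order.

chatbots = {
    "policy": {"containment": 0.72, "repeat_7d": 0.31},
    "claims": {"containment": 0.68, "repeat_7d": 0.12},
}

for name, m in chatbots.items():
    # Share of all contacts genuinely resolved in the automated channel.
    m["resolved"] = m["containment"] * (1 - m["repeat_7d"])

by_containment = max(chatbots, key=lambda n: chatbots[n]["containment"])
by_resolution = max(chatbots, key=lambda n: chatbots[n]["resolved"])
print(f"Best by containment: {by_containment}")  # policy
print(f"Best by resolution:  {by_resolution}")   # claims
```

The policy chatbot resolves roughly 50% of its contacts end to end; the claims chatbot resolves roughly 60%, despite the lower headline containment figure.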
This is not a technology problem.
It is a measurement design problem - the same problem that allows CSAT scores to remain stable while complaints rise, because satisfaction is measured at the point of interaction and resolution is measured nowhere.

What Chatbot Containment Rate Should Be Measured Alongside
Chatbot containment rate in a contact centre is not a useless metric.
It becomes useful when it is one of several measures rather than the primary success indicator.
Three supplementary measures change what the data is telling the operation.
Repeat contact rate at customer level following a contained interaction is the most important.
If customers who completed an automated interaction return to the contact centre within seven to fourteen days, the interaction was not resolved - it was contained and incomplete.
Tracking this cohort separately from customers who did not interact with automation shows whether the automated channel is genuinely resolving demand or deferring it.
An operation running this analysis will often find that the repeat contact rate for post-chatbot customers is higher than for customers who reached an agent first time, which inverts the efficiency narrative the containment rate was supporting.
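The cohort analysis described above can be sketched against a flat contact log. The record layout, field names, and the seven-day window here are assumptions for illustration; most operations would run this against their own contact history tables:

```python
# Minimal sketch: repeat contact rate at customer level, split by whether
# the initial contact went through the automated channel. The toy log and
# its fields are assumptions, not a real schema.
from datetime import datetime, timedelta
from collections import defaultdict

contacts = [
    # (customer_id, timestamp, via_chatbot)
    ("c1", datetime(2024, 5, 1), True),
    ("c1", datetime(2024, 5, 4), False),   # returned within 7 days, on voice
    ("c2", datetime(2024, 5, 1), True),
    ("c3", datetime(2024, 5, 1), False),
]

WINDOW = timedelta(days=7)

def repeat_rate(log, via_chatbot):
    """Share of a cohort's contacts followed by another contact from the
    same customer within WINDOW, regardless of the return channel."""
    by_customer = defaultdict(list)
    for cust, ts, _ in log:
        by_customer[cust].append(ts)
    cohort = [(cust, ts) for cust, ts, bot in log if bot == via_chatbot]
    repeats = sum(
        any(ts < other <= ts + WINDOW for other in by_customer[cust])
        for cust, ts in cohort
    )
    return repeats / len(cohort) if cohort else 0.0

print(repeat_rate(contacts, via_chatbot=True))   # chatbot cohort
print(repeat_rate(contacts, via_chatbot=False))  # agent-first cohort
```

The key design choice is that the repeat lookup is channel-agnostic: a chatbot contact followed by a voice call still counts as a repeat, which is exactly the linkage the standard containment report does not make.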
Transfer rate from automated to agent is the second.
A chatbot interaction that ends in agent transfer is not a contained contact. It is a routing step with extra friction.
High transfer rates from automated channels indicate either that the automated journey cannot resolve the contact types it is receiving, or that customers are not trusting the automation to resolve their issue and are opting for human interaction regardless.
Both findings require a different response than optimising the chatbot design.
The first requires a demand classification exercise - the chatbot may be receiving contacts it was never going to resolve.
The second requires a trust audit - customers may have learned from experience that the chatbot does not help.
Completion rate within automated journeys - the proportion of customers who start an automated interaction and complete the intended action - shows whether the journey is functional at a more granular level than containment rate allows.
Abandonment within the automated journey is a warning signal that something in the design or the scope of automation is creating friction rather than resolution.
AI triage compounds this friction further - Contact Centre AI Triage: What It Does to Your Escalation Rate explains how the two failure modes stack on top of each other.
An operation tracking completion rate by contact type can identify exactly which journeys are working and which are failing, rather than averaging the performance across the entire chatbot into a single containment figure that obscures the variation.
Taken together, these three measures - repeat contact rate, transfer rate, and completion rate - give the operation a view of the automated channel that containment rate alone cannot provide.
The operation can see not just how many contacts were contained, but how many were resolved, how many were deferred, how many were frustrated departures, and how many were contacts the automation was never equipped to handle.
Making the Case for Changing the Measurement Framework
Changing the primary metric for chatbot performance is not a technical challenge.
It is a political one. Containment rate is typically owned by a specific team - digital, automation, or technology - and reported to a leadership audience that has been using it as a benchmark for return on investment.
Proposing to change or supplement it raises a question that carries risk: if containment rate is not the right measure, what does that say about the investment decisions that were made using it?
The argument that lands most effectively is not that containment rate is wrong.
It is that containment rate is incomplete.
The framing is additive rather than corrective: we are not removing a metric, we are building a fuller picture.
The existing containment figures are not invalidated - they remain the denominator.
What changes is the numerator: instead of measuring contacts that did not transfer, the operation begins measuring contacts that did not return.
The business case writes itself once the repeat contact data is pulled.
If the chatbot is deferring a material proportion of contacts back into the voice queue within seven days, the cost per contact calculation that justified the automation investment is overstated.
The saving is not the cost of the contained contact.
It is the cost of the contained contact minus the cost of the repeat contact it generated.
In operations where the repeat rate is high, the net saving is significantly lower than the containment rate implies - and in some cases the automation is generating more cost than it is removing once the full contact lifecycle is costed.
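The full-lifecycle cost logic above reduces to a short calculation. The cost figures and volumes here are hypothetical assumptions chosen to illustrate the shape of the result, not benchmarks:

```python
# Sketch of the lifecycle costing described above. All inputs are assumed.

contained_contacts = 7_200       # contacts that never reached an agent
repeat_share = 0.31              # contained contacts returning within 7 days
cost_agent_contact = 6.50        # assumed fully loaded cost of an agent contact
cost_bot_contact = 0.40          # assumed cost of an automated interaction

# Saving as the containment report implies it: every contained contact
# avoided an agent contact.
gross_saving = contained_contacts * (cost_agent_contact - cost_bot_contact)

# Cost added back by the repeat contacts the contained cohort generated.
repeat_cost = contained_contacts * repeat_share * cost_agent_contact

net_saving = gross_saving - repeat_cost
print(f"Gross saving implied by containment: {gross_saving:,.0f}")
print(f"Cost of repeat contacts generated:   {repeat_cost:,.0f}")
print(f"Net saving:                          {net_saving:,.0f}")
```

On these assumed inputs the repeat contacts claw back roughly a third of the headline saving; with a higher repeat share or a cheaper agent-contact baseline, the net figure can approach zero.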
That analysis, presented alongside the existing containment figures, reframes the conversation.
The question is no longer whether the chatbot is working. It is what working actually means - and whether the current measurement framework is capable of answering it.
The Question to Ask Before Reporting Containment Rate
Before presenting contact centre chatbot containment rate results as a success metric, one question cuts through the ambiguity: of the contacts recorded as contained, how many did not return?
If the operation cannot answer that question - if repeat contact data is not being tracked at customer level against automated channel interactions - the containment rate is measuring activity, not outcomes.
It is counting the contacts that did not transfer to an agent, without knowing whether those customers got what they came for.
That is the definition of a vanity metric. It looks like performance. It does not measure it.
If your contact centre's chatbot containment metrics are strong but contact volume is not falling, the Sentiment Gap intervention identifies whether your measurement framework is capturing the same thing your customers are experiencing - and what the gap is hiding.