Engineering · 8 min read

The customer escalation abstraction layer

Ben McCormack
Posted July 27, 2021

It's Monday morning. You’re on point this week for resolving customer-facing bugs, so armed with your favorite caffeinated beverage—it’s coffee; your favorite caffeinated beverage is coffee—you grab the top item from the queue and get to work. After a frustrating hour where you utter “this should work” more often than you’d care to admit, you finally nail down the root cause (“oh right, this should not, in fact, work”). You commit some code and let the customer support team know a fix is on the way. Out of curiosity, you ask which customer was being affected by this bug.

“Uh.”

Someone is typing…

“Okay, so, they’re not actually a customer. They were trialing our software but didn’t convert. They haven’t signed on in weeks. This item shouldn’t have been in your queue. Really sorry.”

You glance at the next item in the queue and decide it’s best to take an early lunch.

Leverage

Let’s talk about the abstraction layer between customers and engineers, which contains systems and processes that fundamentally affect the productivity of everyone involved.

Product companies design software to support customers at scale. As much as possible, our intention is for customers to succeed without needing to reach out to a human. The software is intuitive and usable. It just works.

But sometimes the software is a bit confusing, or there’s a bug, so we have systems like customer support teams and online documentation to help unblock customers and get them on their way.

If everything is working well, engineering teams can focus on designing solutions that serve all customers, not just the needs of one particular customer. If you find yourself drowning in the problems of individual customers, that can be a sign of an unhealthy organization, namely, that your product doesn’t support customers at scale. Conversely, it can also be a sign of success at scale—perhaps you’ve acquired so many customers that a very small percentage of them have problems. Behind that small percentage are discrete individual customers who need the attention of the product engineering team.

The process for escalating customer issues to engineering teams contains a great deal of leverage. We saw an example of negative leverage at the beginning of this post. Poor processes can lead to queues of nebulous importance, which increase the chances of working on the wrong thing. Not fun.

Fullstory’s support engineering and product engineering teams worked together over the past year to overhaul our customer escalation process to provide positive leverage, improving how we stay in sync, standardizing how we escalate issues, and automating processes to be more bionic. What follows is our recommendation for improving your customer escalation abstraction layer.

Get in sync

At Fullstory, we tend to be fairly averse to standing meetings, preferring to communicate asynchronously as much as possible. In the case of customer escalations, where the process necessarily crosses team boundaries, a weekly meeting adds a great deal of leverage.

My advice: don’t overthink it. Put a recurring event on the calendar to review the customer escalation backlog. Keep a running agenda and spend the time talking about what is directly affecting customers. You’ll find that the process of connecting weekly immediately increases the level of empathy and clarity in communication between teams.

Why does the meeting add so much leverage? When a company is small, teams don’t need as much structure and process; you can stay in sync just by working closely on the same set of problems every day. As companies and teams grow, the systems to communicate across team boundaries aren’t as robust as the internal systems that emerge from within teams. Meetings are a way to immediately increase the flow of information between teams while the processes and systems that connect different teams mature and develop.

Standardize the escalation template

Here are two ways to escalate the same issue:

What’s going on here? Is this a bug?

And:

Take a look at the reproduction steps below. The error at the end is unexpected. Can you determine if the error is caused by our code or the customer’s code? The issue is relatively low priority for the customer, but they would appreciate knowing if the error is on their end. If the error is on our end, we can add the issue to our backlog, but it wouldn’t need to be fixed immediately.

The difference is stark. When it’s unclear what is being asked of them, engineers risk spending valuable time on customer issues (and non-issues) that don’t need it. A standardized escalation template lets engineers narrow their focus to the specific problem at hand, which leaves more time to serve other customers.

We’ve broken down our escalation template into several sections:

What is our clear ask for engineering?

This is the most important part—in one or two sentences, summarize what support is asking from engineering. An engineer should be able to read this section and immediately know what’s being asked of them.

What’s happened so far?

Describe in plain language what the customer has experienced, referencing existing internal and external communication that may be in play. This is where you link out to support tickets and Slack threads. For complex issues, it’s extremely helpful to create a paper trail of internal and customer-facing conversations that have already happened.

Steps to reproduce the problem

Sometimes we can just include a Fullstory session link, but for more complex issues, it helps to spell out the exact reproduction steps along with Expected vs Actual outcomes.

Who is the customer?

Provide the customer’s name and some signal about their relationship with you. We use the customer’s annual recurring revenue (ARR) dollar value as a proxy for customer size. This humanizes the escalation and reminds engineering that the issue is affecting a person or team at a real company and isn’t merely a technical problem to be solved.

What is the business impact?

Describe in plain language how the issue is impacting the customer. This provides an additional opportunity for empathy as well as a signal that can help with prioritization when viewing escalations collectively.

With highly technical escalations, it’s not always clear what engineering needs to do. Being consistently clear about the ask can go a long way to ensure engineers aren’t wasting cycles on unnecessary work and customers are getting the answers they need.
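If it helps to see the template as data, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the class name, field names, and the `missing_sections` and `render` helpers are hypothetical and are not taken from Fullstory’s actual tooling. The idea is simply that each section of the template becomes a required field, so an escalation with a blank ask can be caught before it reaches an engineer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Escalation:
    """One escalation, mirroring the template sections described above.

    All names here are hypothetical; adapt them to your own help desk.
    """
    clear_ask: str            # What is our clear ask for engineering?
    history: str              # What's happened so far?
    steps_to_reproduce: str   # Repro steps, with expected vs actual outcomes
    customer_name: str        # Who is the customer?
    customer_arr_usd: int     # ARR, used only as a rough proxy for size
    business_impact: str      # What is the business impact?
    links: List[str] = field(default_factory=list)  # tickets, Slack threads

    def missing_sections(self) -> List[str]:
        """Return the names of text sections that were left blank."""
        required = {
            "clear_ask": self.clear_ask,
            "history": self.history,
            "steps_to_reproduce": self.steps_to_reproduce,
            "customer_name": self.customer_name,
            "business_impact": self.business_impact,
        }
        return [name for name, value in required.items() if not value.strip()]

    def render(self) -> str:
        """Render the escalation as plain text, with the ask first."""
        sections = [
            ("What is our clear ask for engineering?", self.clear_ask),
            ("What's happened so far?", "\n".join([self.history, *self.links])),
            ("Steps to reproduce the problem", self.steps_to_reproduce),
            ("Who is the customer?",
             f"{self.customer_name} (ARR: ${self.customer_arr_usd:,})"),
            ("What is the business impact?", self.business_impact),
        ]
        return "\n\n".join(
            f"{heading}\n{body.strip()}" for heading, body in sections
        )
```

A support engineer’s tooling could call `missing_sections()` before filing and refuse to create the escalation until the list is empty, which keeps the clear ask from being skipped when things are busy.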

Bionics - put the system into action

Setting up recurring meetings and improving the quality and consistency of your escalations are the highest-leverage changes you can make to improve the escalation process. These two changes will get you 80% of the way there. If you’d like to go the extra mile, you can start automating the systems that have begun adding leverage.

Automate the escalation process. Most help desk solutions provide a way to create macros, snippets, or saved replies. You can use these structured text “forms” as the start of the escalation process. By using the APIs between your help desk and bug tracking system—we leaned heavily on Zapier to connect Zendesk and Clubhouse—you can ensure information is consistent and processes are efficient.

Highlight important information. With automation, you get to choose which information stands out. For us, that meant including the customer name and ARR directly in the title of the escalation. ARR isn’t the only driver of prioritization—we build a product that scales to serve all customers—but it does give us an approximation for customer size. For complex issues that affect multiple customers, we generate an HTML table of all customers and their ARR, summing up the total number of customers and ARR and including that in the title of the escalation.

Structure the escalation backlog to serve as a meeting agenda. You’ll find that when your escalation process is consistent and important information stands out, it’s much easier to identify what’s important and needs to be prioritized. This naturally leads to your backlog being able to serve as a meeting agenda. As a bonus, you can start leaving meeting notes directly in the escalations themselves, eliminating the need to keep track of escalation notes in multiple places.
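To make the “highlight important information” step concrete, here is a rough sketch of how a title and a customer table could be generated. This is not our actual Zapier workflow; the function names, the title format, and the `(name, ARR)` tuple shape are all assumptions made for illustration.

```python
from html import escape
from typing import List, Tuple

# Hypothetical shape: (customer name, ARR in whole US dollars).
Customer = Tuple[str, int]


def escalation_title(summary: str, customers: List[Customer]) -> str:
    """Put the customer (or customer count) and total ARR in the title."""
    total_arr = sum(arr for _, arr in customers)
    if len(customers) == 1:
        who = customers[0][0]
    else:
        who = f"{len(customers)} customers"
    return f"[{who} | ${total_arr:,} ARR] {summary}"


def customers_table(customers: List[Customer]) -> str:
    """Build an HTML table of affected customers, largest ARR first,
    ending with a row that totals the customer count and ARR."""
    ordered = sorted(customers, key=lambda c: c[1], reverse=True)
    rows = "".join(
        f"<tr><td>{escape(name)}</td><td>${arr:,}</td></tr>"
        for name, arr in ordered
    )
    total = sum(arr for _, arr in customers)
    return (
        "<table>"
        "<tr><th>Customer</th><th>ARR</th></tr>"
        f"{rows}"
        f"<tr><td>Total ({len(customers)})</td><td>${total:,}</td></tr>"
        "</table>"
    )
```

Customer names are passed through `html.escape` because they come from free-form help desk fields. Sorting by ARR is just one reasonable default for scanning the table in a meeting; it is not meant to be the only input to prioritization.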

One backlog to rule them all

Improving the escalation process between support engineers and product engineers creates a healthy gravitational pull whereby you start wishing escalations from other teams followed the same process.

For example, sales deals are often driven by a high sense of urgency, which usually means communication happens quickly over Slack, frequently lacking the level of detail that you would find in a standardized escalation workflow. This can be okay (there’s a benefit to allowing for a healthy dose of urgency), but these drive-by escalations also have a tendency to fall through the cracks as threads become inactive and people move on to the next urgent-but-maybe-not-important task. For issues that can’t be immediately resolved, support engineers can route these conversations through the standard escalation process, so engineers have the information they need to prioritize and tackle issues on a longer time horizon and nothing depends on our collective Slack memory.

Every successful product and engineering organization is going to face the challenge of prioritizing individual customer escalations. By improving the processes that serve as the interface between customers and engineering teams, customers get the information they need faster and engineers can spend more time working on solutions that scale to serve all customers.

About the author

Ben McCormack is the Lead Support Engineer turned Technical Program Manager at Fullstory. He's based in Atlanta, GA.
