Blameless CommsAssist - 3 Tips on Making Incident Communication Easy

Emily Arnott | 1.25.2024

When you’re in the thick of an incident, communication is both essential and challenging. A wide variety of stakeholders need timely updates on the situation in order to respond effectively. At the same time, breaking away from the actual work of diagnosis and resolution to send these updates can massively slow progress.

With that in mind, we built CommsFlow, a Blameless feature that streamlines and automates communication, keeping it in the workflow you’re already using for resolving the incident. All you do is draft a message and pre-built rules determine when and where it’s sent. But what about when you don’t even have time to draft a message? That’s where CommsAssist comes in.

CommsAssist is our revolutionary new AI-powered feature that automatically builds out messages for different audiences. Just input the essential facts of the situation that need to be communicated, choose your audience, and you’re done! Back to resolving the incident!

Check out more details on how it all works in our release post, plus all the cool new features in our first update. Once you’re up to speed, read on for three tips on how to get the most out of CommsAssist and make incident communication easier than ever before.

1. Use CommsAssist to build up your incident communication process

In the recent 2024 Catchpoint SRE report, “escalating to, or coordinating between, responsible parties” was the 2nd most common response to “which parts of recent incidents were the most difficult?” – more commonly cited than actually fixing or detecting the problem! Addressing the challenges of incident communication requires more than just an ad-hoc reactive approach. You need to proactively build up a robust process to stop this from being an impediment.

Luckily, CommsAssist provides you the perfect opportunity! Start by identifying all the different types of stakeholders that may need to know the status of an incident. These could include:

  1. Other on-call teams that may need to be called in to escalate
  2. Other engineering teams that may need to know about outages of services they depend on
  3. Managers that may need to authorize other escalations or resource usage
  4. Executives that may need a high-level overview of the resolution timeline
  5. Customer success teams that may need to update customers or customer-facing status pages
  6. Legal teams that may need to review the incident in the context of service level agreements
  7. Sales teams that may need incident timelines in order to manage prospective customers’ expectations

You can be as specific as you like, down to individual roles – if there’s a particular executive that needs to know about incidents in a certain product area, you can include that person as a separate audience.

Once you have your list, think about when each of these audiences would need a status update. Is it when certain tags are added to incidents, signifying that they relate to a specific product area or user base? Is it when incidents increase in severity? Is it when the incident changes status? Is it at a fixed time interval? Build a table to capture these conditions for each audience.

Finally, complete your table by including the large language model prompt that should accompany messages for each audience. Use the default prompts in CommsAssist as a starting point, then tailor them based on what each audience needs to focus on. Send a few trial messages to confirm that each prompt enhances your draft in the way you expect.
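
The audience table described above could be captured in code as well as in a document. The following is a minimal sketch in Python, not CommsAssist's actual configuration format: `AudienceRule`, `IncidentEvent`, and `audiences_to_notify` are hypothetical names, the severity convention (1 is the most severe) is an assumption, and fixed-interval triggers are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AudienceRule:
    """One row of the audience table (hypothetical structure)."""
    audience: str
    prompt: str                                # LLM prompt tailored to this audience
    tags: set = field(default_factory=set)     # notify if the incident has any of these tags
    severity_threshold: Optional[int] = None   # notify at this severity or worse (1 = most severe)
    on_status_change: bool = False             # notify whenever the incident status changes


@dataclass
class IncidentEvent:
    """A simplified snapshot of an incident update (hypothetical structure)."""
    tags: set
    severity: int
    status_changed: bool


def audiences_to_notify(rules, event):
    """Return the audiences whose conditions match this incident update."""
    matched = []
    for rule in rules:
        tag_hit = bool(rule.tags & event.tags)
        severity_hit = (
            rule.severity_threshold is not None
            and event.severity <= rule.severity_threshold
        )
        status_hit = rule.on_status_change and event.status_changed
        if tag_hit or severity_hit or status_hit:
            matched.append(rule.audience)
    return matched


# Example rows; the audiences, tags, and prompts are illustrative only.
RULES = [
    AudienceRule(
        "Executives",
        "Give a high-level overview of the resolution timeline. No technical detail.",
        severity_threshold=1,
    ),
    AudienceRule(
        "Customer success",
        "Focus on customer impact and what belongs on the status page.",
        on_status_change=True,
    ),
    AudienceRule(
        "Payments engineering",
        "Include technical detail about the affected dependencies.",
        tags={"payments"},
    ),
]

event = IncidentEvent(tags={"payments"}, severity=2, status_changed=True)
print(audiences_to_notify(RULES, event))
```

Writing the table down this way makes gaps easy to spot, such as an audience with no trigger at all or two audiences sharing a prompt that should differ.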

This exercise will build up CommsAssist into a one-stop communication powerhouse, capable of handling the vast majority of necessary updates during an incident. But that isn’t all you’ve accomplished – by mapping out these audiences and their requirements, you’ve formalized and explored these key relationships during incidents. This is a critical step in moving from chaotic, unproductive incident coordination to a smooth and efficient process.

2. When it comes to writing messages, keep it snappy

The beauty of CommsAssist is that it fills in all the gaps of your update message – in grammar, context, politeness, and more! Maximize this advantage by minimizing the work you put into the original message. Trust in the AI’s ability to do a lot with a little.

For example, instead of bothering to write out “we’re experiencing an error with our deployment service meaning we can’t push service updates. Looking into potential causes. Our best guess is that this will be resolved within 3 hours.”, try just putting “deployment service down. No service updates now. Eta for fix 3 hours probably.” You can always refine the CommsAssist suggestion if it’s missing a necessary detail.

The time difference between typing out a short and long message isn’t much, but the cognitive benefit you get from not having to “code switch” back to “full English” is huge. You can go even further by copy-pasting incident details from monitoring tools that you’re currently referencing, such as when the incident was detected or the state of resources when it occurred. CommsAssist will work out based on your prompt what details are relevant and how to convey them.
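
To make the idea concrete, here is a rough sketch of how terse responder notes and an audience prompt might be combined into a single request for a language model. This is an illustration under assumptions, not a description of how CommsAssist works internally: `build_request` and the layout of the request text are hypothetical.

```python
def build_request(audience_prompt: str, facts: list) -> str:
    """Combine an audience prompt with terse incident facts (hypothetical helper)."""
    # Drop empty entries so stray blank lines from copy-pasting do not become bullets.
    cleaned = [fact.strip() for fact in facts if fact.strip()]
    bullet_list = "\n".join(f"- {fact}" for fact in cleaned)
    return f"{audience_prompt.strip()}\n\nFacts from the responder:\n{bullet_list}"


request = build_request(
    "The audience is executives who need a high-level overview of the resolution timeline.",
    [
        "deployment service down",
        "No service updates now",
        "   ",  # blank entry, ignored
        "Eta for fix 3 hours probably",
    ],
)
print(request)
```

The responder supplies only the short facts; the wording, tone, and level of detail come from the audience prompt.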

3. Use channel messages for opt-in communications

One of the most powerful aspects of CommsFlow and CommsAssist is the wide range of channels they can use to communicate – email, Slack, MS Teams, phone SMS, and more. Beyond individual direct messages, they can also send your messages straight to a designated Slack or MS Teams channel. This is useful for updating specific groups, but can also be used to build a simple and effective opt-in status channel.

Create a new Slack or MS Teams channel called something like #incident-updates, and a new audience in CommsAssist called something like “Incident Channel”. Make the prompt something like “The audience is members of a Slack channel that are looking for updates on incidents that they aren’t directly involved in. They want to know the general status of ongoing incidents but don’t need technical details. Their main focus is what services are unavailable and how long it will take to restore. Make sure not to include any sensitive information.” Before sending, scan the output message to confirm that no sensitive information was accidentally included.
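
If you want a quick automated check alongside your own read-through, a simple pattern scan can flag obvious problems. The sketch below is a hypothetical helper, not a Blameless feature, and its patterns are illustrative assumptions; it supplements a human review of the message rather than replacing it.

```python
import re

# Illustrative patterns only; tune the list to the kinds of data your incidents involve.
SENSITIVE_PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "IPv4 address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "possible secret": re.compile(r"(?i)\b(?:api[_-]?key|token|password)\b\s*[:=]\s*\S+"),
}


def find_sensitive(message: str) -> list:
    """Return the labels of every pattern that appears in the message."""
    return [
        label
        for label, pattern in SENSITIVE_PATTERNS.items()
        if pattern.search(message)
    ]


draft = "db host 10.0.12.7 unreachable, password=hunter2, page ops@example.com"
flags = find_sensitive(draft)
if flags:
    print("Review before sending, found:", ", ".join(flags))
```

A clean result only means none of the listed patterns matched, so the final check before posting to a broad channel should still be a person reading the message.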

This will allow you to keep internal audiences up to date on incidents quickly and without overloading their other communication channels with unactionable information. Positioning it in a Slack or Teams channel will allow them to comment and discuss each update in threads. Having a centralized location for incident updates across teams and product areas will give everyone a better sense of system health.

Do it all with Blameless, today!

CommsAssist will continue to evolve to make incident communication even easier. Get in on the ground floor and ride the cutting edge by checking out Blameless today! We offer a robust incident workflow that guides each responder through diagnosis and resolution without redundant work. After the incident, learn how to become more resilient with our retrospectives and reliability patterns.

See it all in action by signing up for a demo.
