Incident Communication (CommsFlow) Messaging Templates | Blameless

Emily Arnott
|
2.14.2023

Effective communication is critical during incidents. To minimize the impact of an incident and resolve it quickly, all stakeholders need to be kept informed throughout the incident response process. However, communicating during an incident can be challenging, especially when dealing with multiple stakeholders and a high level of stress. On-call engineers can lose focus when they switch out of their diagnostic tools to issue communications, and getting stakeholders up to speed can be a major time drain.

Blameless CommsFlow is our answer to this challenge. With CommsFlow(™), you can set up templated messages with specific recipients that send automatically when the incident moves through different statuses. For example, you can have an incident communication template that automatically alerts management whenever a new incident is declared, automatically filling itself out with the information of that particular incident. Let’s look at other examples of helpful templates.

Automated Status Updates

One example of a helpful messaging template is an automated status update. This template can be set up to automatically send updates to all stakeholders, such as managers, developers, and customers, whenever the incident status changes. This could include updates when the incident is first identified, when it's being investigated, when a fix has been implemented, and when it's been resolved.

Make sure to explain what each step means for each stakeholder. For example, does this update mean developers could still be needed to come work on the incident, or is the incident past the point where that escalation is needed? Do customer-facing teams need to announce progress to customers with this status update?

Incident communication template for status updates

Notice to [stakeholder group]:

The incident [incident name] has been moved from [previous status] to [current status]. This means that [brief definition of that incident status].

Please note that [description relevant to specific stakeholder group]. Stay tuned for further updates.

Example:

Notice to management:

The incident Login Service Outage has been moved from In Progress to Resolved. This means that a solution has been found and implemented and functionality has been fully restored.

Please note that management will not need to escalate further for this incident. Stay tuned for further updates.
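To make the placeholder filling concrete, here is a minimal sketch in plain Python. It is not the CommsFlow API; the names `STATUS_TEMPLATE`, `STATUS_DEFINITIONS`, `GROUP_NOTES`, and `render_status_update` are hypothetical and exist only for illustration. It shows how the status definition and the stakeholder-specific note can be looked up from the new status and the recipient group, so the same template serves every audience.

```python
# Illustrative sketch only: plain Python, not the CommsFlow API.
# All names below are hypothetical.

STATUS_TEMPLATE = (
    "Notice to {group}:\n\n"
    "The incident {name} has been moved from {previous} to {current}. "
    "This means that {definition}.\n\n"
    "Please note that {note}. Stay tuned for further updates."
)

# Assumed wording for each status; adapt to your own process.
STATUS_DEFINITIONS = {
    "Resolved": (
        "a solution has been found and implemented "
        "and functionality has been fully restored"
    ),
}

# Assumed per-group notes, keyed by (stakeholder group, status).
GROUP_NOTES = {
    ("management", "Resolved"): (
        "management will not need to escalate further for this incident"
    ),
}


def render_status_update(group, name, previous, current):
    """Fill the status-update template for one stakeholder group."""
    return STATUS_TEMPLATE.format(
        group=group,
        name=name,
        previous=previous,
        current=current,
        definition=STATUS_DEFINITIONS[current],
        note=GROUP_NOTES[(group, current)],
    )


message = render_status_update(
    "management", "Login Service Outage", "In Progress", "Resolved"
)
print(message)
```

Keeping the definitions and notes in lookup tables, rather than in the template text, means adding a new stakeholder group only requires new table entries.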

Critical Incident Alerts

When something goes very wrong, in a way that requires escalation or pushes back resolution estimates, it’s important to let people know without slowing down the response process. Setting up incident communication templates that trigger when the incident severity changes ensures that all relevant stakeholders get the update without having to seek it out or distract on-call engineers. They can then prepare messaging to customers or allocate other resources as necessary.

Communication template for incident alerts

Notice to [stakeholder group]:

The incident [incident name] has been escalated from [former severity] to [current severity]. This means that [brief description of that severity].

Please [that group's escalation policy for that severity]. Stay tuned for further updates.

Example:

Notice to engineering leads:

The incident Server Outage has been escalated from Sev3 to Sev2. This means that we are expecting some amount of customer impact and anticipate a longer resolution process.

Please prepare engineers to potentially be called in to work on the incident as required. Stay tuned for further updates.
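The severity-change trigger described above could be sketched as follows. This is an assumption-laden illustration in plain Python, not how CommsFlow is implemented: `SeverityChange`, `ESCALATION_POLICIES`, `SEVERITY_DESCRIPTIONS`, and `alerts_for` are hypothetical names, and the sketch assumes that a lower Sev number means a more severe incident, as in the Sev3 to Sev2 example.

```python
# Illustrative sketch only: plain Python, not the CommsFlow API.
from dataclasses import dataclass


@dataclass
class SeverityChange:
    incident: str
    former: int   # e.g. 3 for Sev3
    current: int  # e.g. 2 for Sev2 (assumed: lower number = more severe)


# Assumed description of each severity level.
SEVERITY_DESCRIPTIONS = {
    2: (
        "we are expecting some amount of customer impact "
        "and anticipate a longer resolution process"
    ),
}

# Assumed escalation policies, keyed by (stakeholder group, severity).
ESCALATION_POLICIES = {
    ("engineering leads", 2): (
        "prepare engineers to potentially be called in "
        "to work on the incident as required"
    ),
}

ALERT_TEMPLATE = (
    "Notice to {group}:\n\n"
    "The incident {incident} has been escalated from Sev{former} to "
    "Sev{current}. This means that {description}.\n\n"
    "Please {policy}. Stay tuned for further updates."
)


def alerts_for(change):
    """Return one alert per group that has a policy for the new severity.

    Nothing is returned unless the change is an escalation.
    """
    if change.current >= change.former:
        return []
    # Fallback wording is a placeholder for severities with no description.
    description = SEVERITY_DESCRIPTIONS.get(
        change.current, "the incident has become more severe"
    )
    return [
        ALERT_TEMPLATE.format(
            group=group,
            incident=change.incident,
            former=change.former,
            current=change.current,
            description=description,
            policy=policy,
        )
        for (group, severity), policy in ESCALATION_POLICIES.items()
        if severity == change.current
    ]


alerts = alerts_for(SeverityChange("Server Outage", former=3, current=2))
for alert in alerts:
    print(alert)
```

Because the policy table is keyed by both group and severity, each group only hears about the severity levels it has a policy for.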

Customer Impact Alerts

Sometimes incidents will change from being internal to customer-facing. For example, you could detect an incident that causes some servers to be unavailable, but the service could still have enough servers for customers to use it. Then, during the process of fixing the servers, demand could increase or more servers could fail to the extent that customers start experiencing outages.

When this happens, the severity of the incident can be easily changed, which can trigger CommsFlow(™) to automatically send a customer impact alert. Customer success and PR teams may need to start writing responses any time customers are affected, so they need to know as soon as possible.

Communication template for customer alerts

Notice to [stakeholder group]:

The [severity level] incident [incident name] is now impacting [customer groups]. Please prepare to contact affected customers. Stay tuned for further updates.

Example:

Notice to customer success teams:

The Sev2 incident Search Outage is now impacting our subscribed users. Please prepare to contact affected customers. Stay tuned for further updates.

Incident Summary Reports

Another helpful template is an incident summary report. This template can be set up to automatically send a summary of the incident to all stakeholders once the incident has been resolved. The report could include information such as the root cause, how the incident was resolved, and any actions that are being taken to prevent similar incidents from happening in the future. This can also be sent to management, engineering, and other departments for further analysis.

To enhance the information provided by this update, link the corresponding incident retrospective to give people the full story of the incident. Talk to the people who review retrospectives to find out what information they look for most often, and consider including that information in the CommsFlow template for ease of access. You could also highlight reliability insights dashboards that were impacted by the incident to give stakeholders context on how this incident compares to others.

Summary report template you can use

Notice to [stakeholder group]:

The incident [incident name] has been resolved. This incident was a [severity level] incident affecting [technical area].

For more information, please see [retrospective link].

Incident Retrospective Review Invitations

Finally, retrospective review invitations can be set up to automatically invite stakeholders to participate in a review of the incident once it has been resolved. This can help ensure that all stakeholders have an opportunity to provide feedback and suggestions for how the incident could have been handled better in the future. These can be triggered when the incident’s status changes to “resolved”.

Ultimately, you want each incident to lead to systemic changes that mitigate future incidents like it. Having post-incident review meetings and retrospective documents gives you an opportunity to identify and assign follow-up tasks.
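Both the summary report and the review invitation fire when an incident reaches the resolved status, so one status change can fan out into several messages. The sketch below illustrates that idea in plain Python under stated assumptions. It is not the CommsFlow API: the `on_status` registry, the incident dictionary fields, the invitation wording, and the example.com link are all hypothetical.

```python
# Illustrative sketch only: plain Python, not the CommsFlow API.

# Maps a status to the message builders that fire when an incident enters it.
_rules = {}


def on_status(status):
    """Decorator that registers a message builder for a status."""
    def register(builder):
        _rules.setdefault(status, []).append(builder)
        return builder
    return register


@on_status("Resolved")
def summary_report(incident):
    # Follows the summary report template above.
    return (
        f"The incident {incident['name']} has been resolved. "
        f"This incident was a {incident['severity']} incident "
        f"affecting {incident['area']}.\n\n"
        f"For more information, please see {incident['retrospective_link']}."
    )


@on_status("Resolved")
def review_invitation(incident):
    # Assumed wording; the article does not give an invitation template.
    return (
        f"You are invited to review the retrospective for "
        f"{incident['name']}: {incident['retrospective_link']}"
    )


def messages_for(incident, new_status):
    """Build every message registered for the status the incident entered."""
    return [build(incident) for build in _rules.get(new_status, [])]


incident = {
    "name": "Login Service Outage",
    "severity": "Sev2",
    "area": "authentication",
    "retrospective_link": "https://example.com/retrospectives/123",  # placeholder
}

for message in messages_for(incident, "Resolved"):
    print(message)
    print("---")
```

Registering builders per status keeps each template independent, so adding a new post-resolution message does not require touching the existing ones.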

Using CommsFlow(™), teams can set up custom messaging templates that are tailored to the specific needs of their organization. This can help ensure that stakeholders are kept informed and updated during incidents, while reducing the stress and workload on the incident response team. By using these messaging templates, teams can communicate more effectively during incidents and ultimately minimize the impact of an incident and resolve it more quickly.

To see CommsFlow(™) in action, start your free trial today!
