Case Study

BambooHR & Blameless: “Setting people free to do great work”

Create better alignment across the organization on incident response process
Make tracking and reporting on incident data simple for the whole team
Automate important portions of the incident response workflow
Facilitate better retrospectives and encourage more consistent follow-up actions

Starting From Core Principles

Dave Petersen, the Director of IT Operations at BambooHR, appreciates the challenge of building and maintaining complex systems. Dave’s experience in ITOps and system administration stretches back to the mid-’90s. In joining BambooHR, he discovered an interesting alignment between his role as an IT leader and his company’s mission.

{{quote1}}

BambooHR was founded in 2008 by Ben Peterson and Ryan Sanders to empower HR pros, employees, and organizations everywhere to simplify complicated processes and streamline time-consuming tasks so they can focus on supporting their people and thriving in their roles. If there ever was a corporate mission that aligns better with the philosophical goal of an IT Operations leader, we haven’t heard it.

Of course, Dave isn’t alone. With the help of process engineer Jordan Jones and systems engineer Weston Sundwall, the IT Operations group at BambooHR is on its own mission to set its engineering organization free to do great work by making incident management a mechanism for continuous improvement rather than a never-ending process of fighting fires.

Understanding The Challenge Confronted by BambooHR

The last several years have been a period of rapid growth for BambooHR. In 2022, they became the #1 HR solution in EMEA and APAC for tech customers. This rapid growth brought significant new incident management challenges for Dave Petersen and the team.

{{quote2}}

That pain Dave describes was felt across the entire engineering group as system engineers like Weston Sundwall and on-call engineers for customer-facing microservices battled with a growing number of disruptions.

{{quote3}}

The company recognized the need for a suite of tools that automate workflows around incident management and create better feedback loops for incident learning in the process. This need became acute as the frequent context switching between engineering and firefighting modes was disrupting innovation and development schedules, while the lack of a process for tracking minor incidents was leading to repeated, unaddressed disruptions.

{{quote4}}

Hunting For The Right Solution

The first phase of the incident program was an in-depth assessment of BambooHR's current incident management processes. This stage was crucial for understanding the gaps in their existing system and identifying specific areas where a tool could bring immediate improvements. The assessment was spearheaded by Jordan Jones, process engineer, and project and program manager for the incident management program.

{{quote5}}

In their opening assessment, it was clear to Dave and Jordan that to succeed in transforming incident management at BambooHR, they needed to foster a culture of continuous improvement. To do that, they were going to need the right tools. In particular, they needed a set of incident management tools that helped:

{{tools}}

It was also clear to Dave that if they were successful, it would mean a dramatic reduction in the negative impact incidents were having on the team.

{{quote6}}

Blameless Setting Incident Responders Free To Do Great Work

In response to these challenges, BambooHR chose Blameless, a tool designed to transform incident management into a mechanism for continuous improvement. The decision was driven by Blameless's ability to automate incident response tasks like acknowledgment, creation of channels, assembly of the team, routine stakeholder updates, and data collection.

{{quote7}}

Blameless's retrospective functionality was also a major factor in the team's decision, as it promised to automate the more time-consuming elements of data collection while making it easy for the team to set standards around and collaborate during the retrospective process.

{{quote8}}

As with the retrospective functionality, the ability to integrate broadly with the rest of their engineering tools and to scale across teams working on both internal IT and the customer-facing product weighed heavily in the team's final decision to choose Blameless.

As part of the rollout, BambooHR also planned for the future scalability of the incident management system. The goal was to have Blameless used not just for internal IT incidents but also for managing incidents related to the customer-facing product. This broad scope required a flexible and scalable solution that could adapt to the growing needs of the company.

{{quote9}}

Finally, the Blameless team showed up to support BambooHR every step of the way, offering guidance on best practices during evaluation and implementation.

{{quote10}}

The Blameless & BambooHR Partnership Bears Fruit

As the BambooHR team has begun applying Blameless to their incident management program, the primary goal is to enter a continuous improvement cycle where incidents, big or small, are consistently recorded and analyzed. This approach turns every incident into an opportunity for learning and improvement, thus reducing repetitive incidents and enhancing overall operational efficiency. The integration of Blameless with existing tools like Jira streamlines the process further, ensuring seamless management of action items and improvements post-incident.

Additionally, automating the detection and initiation of incident management is a crucial objective. This move is expected to reduce the time to recovery and minimize errors or misclassifications in incident handling. The broader goal is to shift from manual to automated incident creation, which will serve as a key metric of success for the implementation of Blameless.

{{quote11}}

In terms of communication, Blameless is expected to bring alignment across the organization regarding incident management progress. By providing clear visibility and reporting on the state of incidents, the tool will help reconcile differing perceptions about the severity of the incident management problem within the company.

{{quote12}}

“If you go to our website and look at our story, it's right there in giant words: ‘BambooHR exists to set people free to do great work.’ Within our IT Operations group, we apply that mission to support both our employees and our customers. Our work is meant to help them achieve great things in their career and their company.”

Dave Petersen

Director of IT Operations, BambooHR

“With the expansion, it became increasingly difficult to notify and communicate with the right people during incidents. Identifying the appropriate personnel to address these issues was also becoming a complex task due to the sheer number of employees compared to just a few years prior.”

Dave Petersen

Director of IT Operations, BambooHR

"My team is the primary team on-call for our systems infrastructure...and by default, we'd be brought into fires where we’re trying to figure out what's broken, what team should be assigned, where updates should be going. It was kind of a snowball effect of trying to determine whether my team should even be involved, if not, who should be? By the end of it, you might have a war room occupied by several people for an extended amount of time that never needed to be there."

Weston Sundwall

Systems Engineer, BambooHR

“We were struggling on a number of fronts. We didn’t have clear visibility to how many minor incidents were occurring across our team. It takes us off of that innovation track when we have to constantly stop, switch back to a firefighting mode, figure out what to do, and try to fix that. We were also having a hard time right-sizing our incident response. We had too many people involved, which was also putting a drain on our resources.”

Dave Petersen

Director of IT Operations, BambooHR

“When I was first approached by Dave and asked if I would help with the incident management program, I had a lot of questions like ‘How many incidents do we get on average?’ ‘How big are the incidents?’ Dave’s response was essentially, ‘That's what we’re going to find out!’ We started trying to gain a comprehensive view of the frequency and nature of incidents within the organization. This highlighted a lack of visibility into minor incidents and the need for a more structured process to record and analyze all incidents, regardless of their size.”

Jordan Jones

Process Engineer, BambooHR

“We saw a need for a suite of tools to help us automate those workflows. When it comes to incident management, humans add latency and inconsistency. And those are the two things you want to eliminate. Not that we’re looking to take the human out of the equation. Rather, we wanted to automate all the tasks they’d need to do consistently so they could focus on resolving that incident. Also, if you can't measure something, it's difficult to know where you need to improve. Having a tool that helps us standardize the engineering group on a consistent process allows us to get the data we need to start building a better company.”

Dave Petersen

Director of IT Operations, BambooHR

“When incidents happen, we want to be very responsive, get to that root cause as quickly as possible and the resolution of the incident, and we also want to do it the same every single time, so that the response is as thorough and consistent as possible from incident to incident. We saw Blameless as the best tool to help us to do that.”

Dave Petersen

Director of IT Operations, BambooHR

"The retrospective functionality was really exciting when I started to learn about that from Blameless. While retrospectives are the most unpopular piece of a project or program management, it is absolutely the most valuable, and I've always gotten good information when I've conducted retrospectives."

Jordan Jones

Process Engineer, BambooHR

“Ultimately, we want to have our monitoring and alerting systems tuned enough that we can detect incidents and immediately start the incident management process without anybody having to do anything. Blameless can help make that possible.”

Dave Petersen

Director of IT Operations, BambooHR

“In my conversations with Matthew, I think he's been really great just in general of obviously taking us through the product, but also when we've had questions about specific things where I was troubleshooting or couldn't get something set up, he'd answer the question and take an aside and be like, this is kind of maybe a best practice or something that we focus on where we think our software really shines, and these are some ideas of things you could implement here.”

Weston Sundwall

Systems Engineer, BambooHR

“Anytime we can reduce the amount of resources it takes to properly manage an incident and also communicate the status of that incident out is a huge win for us as a company.”

Dave Petersen

Director of IT Operations, BambooHR

“I'm really excited about using visibility and reporting to align all of these stakeholders and provide something that clearly says we've quantified the situation and we're taking steps to make it better.”

Jordan Jones

Process Engineer, BambooHR
