Navigate Incident Management Like a Pro: MyFitnessPal's Sr. Director of Engineering Shares Insider Strategies with Lee Atchison
How much time are engineering teams spending on incidents?
Are you trying to set your engineering team free to do their best work? Read our new case study to learn how Blameless can help you do that.
Case Study

MyFitnessPal Giving Power Back to Engineers with Blameless

Chris Karper, Senior Director of Platform Engineering

With over 25 years of experience in software architecture and engineering leadership, Karper has been instrumental in steering MyFitnessPal through all of its technological evolutions, particularly in the realm of incident management and engineering processes.

Introduction to MyFitnessPal and Chris Karper

MyFitnessPal, a pioneer in the mobile health and fitness space, stands as a testament to technological advancement in personal wellness. At the helm of its platform engineering department is Chris Karper, Senior Director of Platform Engineering.

The Pre-Blameless Era

Before adopting Blameless, MyFitnessPal faced significant challenges in incident management. As Karper recounts, "We had several different departments who worked together, but not always well," which left incident processes fragmented and inefficient. Without standardized incident response playbooks or clear communication, the response to each incident was reactive and ad hoc, and engineers faced stress and uncertainty whenever something broke. The wider organization was beginning to feel the effects in the form of increased engineer attrition.

Implementing Blameless

Recognizing the need for change, MyFitnessPal turned to Blameless in the hope of adopting standardized incident response playbooks, consolidating best practices, and introducing a level of automation that was previously absent. Blameless was integrated primarily through Slack, which was already a central workspace for the engineers. This integration allowed for a smoother transition and immediate engagement with the new system.

Benefits of Adopting Blameless

1

Cultural Shift Toward Engineer Empowerment

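One of the most significant outcomes of implementing Blameless was a cultural shift within MyFitnessPal toward greater decision-making authority for engineers. Karper highlights the importance of this shift:

“All of these tools, all of that communication providing all this data around service quality is all with the goal of making our engineers feel like there's less burden on them... It's not a debate anymore. There's objective data saying they're right.”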

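Moreover, the availability of incident data, and the standard for exploration and reporting that comes with it, has had a big impact on the willingness of individual engineers to participate openly in retrospectives. In Karper's words:

"Everybody going into an incident should feel psychologically safe. We can talk about what we did right and wrong, and I think blameless in the process makes that a lot easier, and it makes people feel structured and supported."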

2

Standardization and Automation of Incident Management Processes

Blameless also helped standardize the incident management processes across MyFitnessPal, creating uniformity and predictability. Karper notes the success of this standardization: “It gave us one playbook to follow that worked for everybody.”

The automation aspect also reduced the tedium and potential for human error, streamlining the incident response.

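For the MyFitnessPal team, Blameless plays a crucial role in automating the more tedious aspects of incident management and providing a structured framework for handling incidents. This not only makes incident response more efficient but also lets engineers focus on their craft and reduces the burden of ad hoc decision-making in high-pressure situations. Karper describes the response from engineers:

"The engineers who use it are actually really passionate about it. They love it... It automates the tedium. It helps give that structure. They know they're gonna move through the incident in the right order because it's all like right there for them in Slack."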

3

Data-Driven Insights for Service Quality Improvement

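Data-driven decision-making is a core tenet of MyFitnessPal’s engineering process. As Chris Karper puts it:

"A lot of time here has been spent on trying to figure out how to make the prioritization of service maintenance tackling tech debt... a really data-driven process... It's easy for us to bubble up and say, this is what everybody thinks we should be working on. But... leaders wanna see a number, they wanna justify it."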

Blameless made it simpler for Karper and his team to aggregate incident data and put it to use. This, in turn, enabled MyFitnessPal to make informed decisions about service maintenance and technical debt prioritization, shifting from subjective judgments to objective analysis. As Karper explains, "The biggest difference comes... from the aggregation of incident data... being able to look at those longer-term trends is really where I find the value. And having just one place where all of our incident data lives."
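
To make the idea of aggregating incident data more concrete, here is a minimal sketch of how incident records might be rolled up per service and per month to support prioritization. It is an illustration only: it does not use the Blameless API, and the record fields, service names, and numbers are hypothetical placeholders rather than MyFitnessPal data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical incident records. The field names, service names, and
# values are assumptions made for illustration, not real data.
INCIDENTS = [
    {"service": "service-a", "opened": date(2023, 1, 14), "minutes_to_resolve": 95},
    {"service": "service-b", "opened": date(2023, 1, 20), "minutes_to_resolve": 30},
    {"service": "service-a", "opened": date(2023, 2, 3), "minutes_to_resolve": 120},
    {"service": "service-c", "opened": date(2023, 2, 17), "minutes_to_resolve": 45},
    {"service": "service-a", "opened": date(2023, 3, 9), "minutes_to_resolve": 60},
]


def summarize_by_service(incidents):
    """Return (service, totals) pairs sorted by total resolution minutes, highest first."""
    totals = defaultdict(lambda: {"count": 0, "minutes": 0})
    for incident in incidents:
        entry = totals[incident["service"]]
        entry["count"] += 1
        entry["minutes"] += incident["minutes_to_resolve"]
    return sorted(totals.items(), key=lambda item: item[1]["minutes"], reverse=True)


def monthly_counts(incidents):
    """Return incident counts keyed by (year, month), in chronological order."""
    counts = defaultdict(int)
    for incident in incidents:
        opened = incident["opened"]
        counts[(opened.year, opened.month)] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    for service, stats in summarize_by_service(INCIDENTS):
        print(f"{service}: {stats['count']} incidents, {stats['minutes']} minutes to resolve")
    for (year, month), count in monthly_counts(INCIDENTS).items():
        print(f"{year}-{month:02d}: {count} incidents")
```

A ranking like this gives leaders the kind of number Karper describes: the service with the most cumulative incident time becomes a candidate for maintenance or tech debt work, and the monthly counts show whether the trend is improving.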

4

Structured Learning from Incidents

Finally, Blameless enhanced the organization's ability to learn from incidents. Karper passionately states, “...incidents are the one wholly-owned most important resource for learning.” The structured approach to incident retrospectives provided by Blameless allowed for deeper insights and continuous improvement.

Conclusion

The adoption of Blameless by MyFitnessPal under Chris Karper’s leadership marked a turning point in the company's approach to incident management and engineering culture. The transformation led to a standardized and automated process, a shift towards a culture that prioritizes engineering happiness, the ability to make data-driven decisions, and a structured approach to learning from incidents. These changes not only streamlined operations but also profoundly improved the satisfaction and effectiveness of the engineering team, setting a new standard for incident management in the tech industry.

“All of these tools, all of that communication providing all this data around service quality is all with the goal of making our engineers feel like there's less burden on them... It's not a debate anymore. There's objective data saying they're right.”

Chris Karper
Senior Director of Platform Engineering

"Everybody going into an incident should feel psychologically safe. We can talk about what we did right and wrong, and I think blameless in the process makes that a lot easier, and it makes people feel structured and supported."

Chris Karper
Senior Director of Platform Engineering

"The engineers who use it are actually really passionate about it. They love it... It automates the tedium. It helps give that structure. They know they're gonna move through the incident in the right order because it's all like right there for them in Slack."

Chris Karper
Senior Director of Platform Engineering

"A lot of time here has been spent on trying to figure out how to make the prioritization of service maintenance tackling tech debt... a really data-driven process... It's easy for us to bubble up and say, this is what everybody thinks we should be working on. But... leaders wanna see a number, they wanna justify it."

Chris Karper
Senior Director of Platform Engineering