It’s been a few months since I joined Blameless and now that I’ve got my footing, I figured it’s a good time to share why I joined and what I think the future holds – now that I’m part of it.
In recent team interviews across all levels, a specific question keeps coming up: “Why did you join Blameless and what got you initially excited?”
I’ve been reflecting on this very question over the last few days and realized what got me excited early on just scratched the surface compared to what gets me excited now.
Reducing Communication Overhead and Toil
What initially got me on a call with Lyon (CEO) was the idea of reducing overhead when running an incident, such as handling an outage. It always seemed so wasteful to have to run what are essentially “information errands”: collecting data about what’s actually occurring during an outage, then packaging and reporting it just to keep everyone calm. Isn’t there a better way to automate this? How can I have this “just happen” so I can focus on shielding my team while they fix the problem?
This was the first thing that got me thinking about Blameless, and rightly so. However, it turns out there’s much, much more.
The Iron Man Suit
During my time at OpenDNS, I thought we did a pretty great job running a world-class independent recursive DNS service. That said, we had cobbled together a set of tools and best practices that were hard won by a set of (respectfully) grizzled veterans who had been in the trenches for years and seen almost everything. These folks had cut their teeth on world-class SaaS as it was being invented, and had the scars and knowledge to show for it. We knew how to run an outage/degradation, had created SLOs ourselves (though they weren’t called that yet), had a codified retro process, and had built up years of runbooks to onboard new folks and make them effective quickly.
However, not everyone can hire, retain, and sustain this kind of team in today’s complex and ever-changing tech landscape, and yet everyone now needs a world-class operations team to run and maintain their SaaS offerings. If COVID taught us anything, it’s that most companies needed to go all-digital much sooner than they anticipated. Even companies that had been planning five years out decided to begin right away, as physical experiences pretty much evaporated. When I was at Kong, we saw a massive upswing in adoption of our API management platform because the world had changed. If you didn’t change with it, you went out of business, like thousands of other companies that couldn’t move fast enough.
But there was another, more frightening trend that came along with this acceleration of digital transformation (yeah, I know, it’s yesterday’s buzzword): a majority of these teams weren’t equipped with the experience to do it right. There simply wasn’t enough experienced talent to go around to furnish every shiny new digital platform with a world-class team to run it out of the gate, so companies struggled to get up and running quickly.
What I heard from Lyon in the early days after I joined was inspiring: a vision for Blameless to provide teams with a virtual “Iron Man Suit”, that is, an operational advantage based on best practices built by the world’s best operations teams. Tony Stark is just a man (albeit a genius), but when he puts on the suit, he becomes a superhero capable of what seems impossible!
By providing a platform to uplevel reliability engineers and operations teams, we have the opportunity to improve the quality of every online experience. And if the last two-plus years are an indication of what’s to come, that’s a pretty big deal. What’s more, we also reduce burnout, which could mean a better quality of life for potentially millions of engineers.
That’s the real goal: improve people’s lives.
Of course, the opportunity to create a new market is pretty exciting too, and perhaps a new Gartner Magic Quadrant wouldn’t hurt either (smile).