The Blameless Blog
In a recent industry leaders’ roundtable hosted by Blameless, top experts discussed best practices for responding to incidents, scaling for reliability, and engineering with the customer in mind.
In this blog post, I’m excited to share a project that I’m working on to reduce our cloud spend. Let’s take a look at some of the non-outage ways we use Blameless Incident Management to manage, track, and keep a historical record of projects that might have durations of weeks or months, versus nail-biting minutes or hours.
There was a rushed stillness in the air. That odd pre-tsunami silence briefly collapsed around us. Then, pandemonium: blasts, screams, horns, cheers... Within minutes of each other, Los Angeles won the World Series and Matt Davis accepted a role as Senior Infrastructure Engineer at Blameless.
In a recent fireside chat, Mohan Bhatkar, Head of Engineering for the Customer Reliability Platform at Mercari, Inc., sat down with Blameless Co-Founder Ashar Rizqi. They talked about scaling while avoiding silos, exciting day-to-day challenges, instilling a culture of empowerment, and more. Here are their top insights and the lightly edited transcript of their conversation.
We’re drinking Pumpkin Spice Lattes, lighting candles, and wearing flannel. Oh, and reading a bunch of great stuff. Here’s the November issue of SREview! This monthly zine features epic Tweets, content, and events happening in the SRE and resilience engineering community.