The blameless blog
What Are MTTx Metrics Good For? Let's Find Out.
MTTx metrics rarely tell the whole story of a system’s reliability. To understand what MTTx metrics are really telling you, you’ll need to combine them with other data. In this blog post, we'll share some alternatives to the basic MTTx metrics you might be using.
April 13, 2021
Postmortems Now Called Retrospectives in Blameless
Something big happened at Blameless this month: our “Postmortem” feature was renamed “Retrospective”. To the naysayers, I suppose you’re thinking, “This seems trivial. Different teams call it different things anyway, so why bother making the change?” First, let me say thank you for reading our blog, and I hope you read this one through to the end. Now, allow me to explain our reasoning and why we’re excited about this update.
March 2, 2022
SRE and Fighting Games
When learning SRE, you might find its principles a bit unintuitive. For example, it might be difficult to understand why aiming for 100% reliability is wasteful, how reliability isn’t the same as availability, or why failure ought to be celebrated. Believe it or not, there is a method to these ideas. My goal in this article is to shed light on the principles and to leave you a believer, ready to take steps toward starting SRE practices.
October 26, 2021
Situation Room: On-Call Team Faces Worst Case of Sunday Scaries
Ask incident responders to share stories of standout experiences on the job, and you’ll hear wonderful anecdotes. That’s precisely what we did. Our latest blog post recounts a real story from one of our own engineers.
August 31, 2021