The Blameless Blog

Failure Is Not An Option Inevitable

Featured Post

Mar 3, 2021
SRE as Organizational Transformation: Lessons from Activist Organizers

Activist organizers are in the business of changing minds and behaviors, leading decision-makers and traditional power holders in new directions. Here’s a curated list of their tips and practices that you can use to bolster your company’s transformation efforts.

Mar 2, 2021
SRE2AUX: How Flight Controllers were the first SREs

This is the story of the first 10 minutes of the Apollo 12 mission. Now you might be wondering, what does all this vintage space lore have to do with site reliability engineering in the 21st century? Well, I invite you to read on and find out.

Feb 23, 2021
SREview Issue #10 February 2021

Is love in the air? We think so. While we don’t have chocolate or flowers for you, we have something just as sweet. Here are some of the most exciting Tweets, content, and events happening in the SRE and resilience engineering community this February.

Feb 22, 2021
QA Engineers, This is How SRE will Transform your Role

In this blog post, we’ll break down how SRE transforms the role of QA, and highlight the improvements it brings for the team.

Feb 17, 2021
Getting Started as an SRE? Here are 3 Things You Need to Know.

In this blog post, we’ll look at the key knowledge and skills an SRE needs, positions and credentials that can develop into the SRE role, and the career paths of some successful SREs.

Feb 16, 2021
4 Things you Need to Know about Writing Better Production Readiness Checklists

In this blog post, we’ll cover how to make a production checklist, why production checklists are helpful, how to keep your checklist up to date, and how Blameless can help integrate your checklists.

Feb 9, 2021
4 Tips on Preparing for a [Great] Failure

In this blog post, we’ll look at SRE techniques for mitigating the impacts of system failure, including building runbooks, assessing with SLOs, monitoring metrics, and building a blameless culture.

Feb 8, 2021
Communication Tool Down? Here are 3 Ways to Handle it

With the rise of microservices, our systems rely on a complex ecosystem that involves many third-party vendors and tools. Some of the most important tools teams use daily are those for collaboration. So how can we plan for failure in systems outside of our control?

Feb 2, 2021
"I'm Just Doing my Job," An SRE Myth

SRE can help ensure that teams are customer-focused, even if the best way forward breaks the rules or requires you to re-write them. Two ways SRE accomplishes this are by fostering a culture of blamelessness and using SLOs to glean insights into the customers’ experience.

Jan 26, 2021
Who Else Wants to Increase Development Velocity?

In this blog post, we'll look at how SRE tightens feedback loops and decreases friction, and how development velocity generates business value.

Jan 25, 2021
Have a Cloud Transition you can be Proud Of

In this blog post, we’ll look at how SRE helps with migrating to and operating in the cloud, and share some tips on how to maximize reliability.

Jan 19, 2021
SREview Issue #9 January 2021

New year, new SRE! We’ve said goodbye to 2020 and hello to 2021. Here are some of the most exciting Tweets, content, and events happening in the SRE and resilience engineering community so far this year.

Jan 18, 2021
Top Reliability and Scaling Practices from Experts at Citrix, Greenlight Financial Technology, and Incognia

In a recent industry leaders’ roundtable hosted by Blameless, top experts discussed best practices for responding to incidents, scaling for reliability, and how to engineer with the customer in mind.

Jan 12, 2021
This Is the Most Underappreciated Skill for SREs

It's important to learn what goes into glue work and focus on ways to appreciate those who do it. In this blog post, we’ll highlight some examples of glue work SREs perform: building a common language, forging connections, and establishing culture.

Jan 11, 2021
The Secret of Communicating Incident Retrospectives

In this blog post, we’ll show how to coordinate incident retrospectives across different stakeholder groups, how to cultivate a culture of blamelessness during the process, and how to drive change from key findings.

Jan 7, 2021
How Mercari Scales Vision, Culture, & Reliability (Japanese)

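Mohan Bhatkar, Head of Engineering for Mercari's Customer Reliability Platform, recently sat down with Ashar Rizqi, co-founder of Blameless. They discussed how to scale while breaking down departmental silos, the exciting challenges they face every day, and how to instill a culture of empowerment. Here are the key insights, along with condensed excerpts from their conversation.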

Jan 5, 2021
Little Known Ways to Better Use Your Error Budgets

In this blog post, we’ll look at how error budgets can help cross-functional teams across the organization such as QA, legal, executives, and more. We’ll also look at ways engineers can use error budgets beyond development planning.

Jan 4, 2021
Modern Operations Best Practices from Engineering Leaders at New Relic and Tenable

In a recent industry leaders’ roundtable hosted by Blameless, top experts discussed how teams can embrace SRE best practices and make cultural shifts towards blamelessness.

Dec 17, 2020
How to Cut Cloud Costs for 2021 Using Blameless

In this blog post, I’m excited to share a project that I’m working on to reduce our cloud spend. Let’s take a look at some of the non-outage ways we use Blameless Incident Management to manage, track, and keep a historical record of projects that might have durations of weeks or months, versus nail-biting minutes or hours.

Dec 11, 2020
How I Joined Blameless, Matt Davis, Senior Infrastructure Engineer

There was a rushed stillness in the air. That odd pre-tsunami silence briefly collapsed around us. Then, pandemonium: blasts, screams, horns, cheers... within minutes of each other, Los Angeles had won the World Series and Matt Davis accepted a role as Senior Infrastructure Engineer at Blameless.
