The Blameless Podcast

Resilience in Action E11:

Applying SRE Principles to Other Aspects of Work and Life with Jennifer Petoff

October 27, 2021

Kurt Andersen

Kurt Andersen is a practitioner and an active thought leader in the SRE community. He speaks at major DevOps & SRE conferences and publishes his work through O'Reilly in quintessential SRE books such as Seeking SRE, What is SRE?, and 97 Things Every SRE Should Know. Before joining Blameless, Kurt was a Sr. Staff SRE at LinkedIn, implementing SLOs (reliability metrics) at scale across the board for thousands of independently deployable services. Kurt is a member of the USENIX Board of Directors and part of the steering committee for the worldwide SREcon conferences.

Kurt Andersen:

Hello. I'm Kurt Andersen. Welcome back to Resilience in Action. Today, we're talking with Jennifer Petoff, who is currently the Director of SRE Education for Google. She's one of the co-editors of the bestselling book, Site Reliability Engineering, and she's a lead author of Training Site Reliability Engineers. Both of these will be linked in the show notes. Welcome to the podcast, Jennifer.

Jennifer Petoff:

Ah, thanks so much Kurt. Really great to be here today.

Kurt Andersen:

You have a very interesting and varied background. Can you tell our listeners a bit about yourself?

Jennifer Petoff:

Sure. So yeah, a little bit about myself. You gave the standard bio boilerplate, so here's something you might not know about me. I'm Jennifer Petoff; my friends call me Dr. J, because I actually have a PhD in Chemistry from way, way back in the day. I've been at Google for over 14 years now. Time flies when you're having fun, I suppose. And again, I started that career in chemistry, working in the lab, working on things that could literally start on fire or explode if you exposed them to air. And that was literally quite exciting, but I'm happy to say that my role today really is equally exciting, just not in that same physically dangerous way.

Jennifer Petoff:

And so what I do today is I currently lead the global SRE Education team for Google or what we call SRE EDU. And yeah, you mentioned that I worked on the SRE book which we published back in 2016 and I do have to say that was the coolest project that I've worked on at Google. Other things to know about me, when we aren't living with the pandemic, I love to travel and I'm a part-time travel blogger at Sidewalk Safari.

Kurt Andersen:

Oh, cool. We'll circle back to that a little later, but can you tell us a bit more about your journey and how it's evolved over 14 years at Google?

Jennifer Petoff:

Yeah, it has definitely been an evolution, and my career at Google has really spanned three offices on two continents. So I started at our headquarters in Mountain View, I worked up in the city in San Francisco for a while, and then moved to Dublin. Dublin, Ireland, not the East Bay, for those who happen to be tuning into the podcast from California. And during that time, and across those geographies, I've worked in a variety of functions. I actually joined Google as a member of the university programs team back in 2007, and I worked to hire MBA and PhD grads. And from there I actually moved on to DoubleClick pretty soon after Google had acquired DoubleClick, and I developed curricula and taught publishers how to use DoubleClick's ad serving products.

Jennifer Petoff:

Next up, I moved to Ireland, where I ran operations projects for the AdWords global customer services team, until in 2014 I actually made my way to SRE as a program manager. And from that point I never really looked back. I really feel like now I've found my home within Google. I've spent over half of my 14-year career at Google in SRE and just really, really love it.

Kurt Andersen:

Great. You mentioned program manager. Every company seems to have a slightly different spin on what that kind of a role consists of. Can you give us some idea of what it looked like for you?

Jennifer Petoff:

Yeah. So being a program manager at Google... I mean, it's how do you herd the cats, if you will? How do you take something from a nebulous and ambiguous idea and turn it into something real basically?

Kurt Andersen:

Okay.

Jennifer Petoff:

So I think, there's... Yeah. And it's a lot of, how do you build trust and credibility in working with people and, yeah, drive key initiatives throughout the organization.

Kurt Andersen:

Okay, cool. So how'd you go from a Chemistry PhD to leading the [inaudible 00:04:26] team that assembled such a bestselling book and it seems like it's been [eons 00:04:31] but 2016, that's only five years but for the Site Reliability Engineering book.

Jennifer Petoff:

Yeah. So yeah, it's hard to believe it's been five years since the book. It feels like just yesterday, but yeah, like I say, time does fly when you're having fun. But like you say, it has been a bit of an evolution. So going from that PhD in Chemistry to ending up in SRE. And I think it all boils down to the fact that I tend to make my career decisions based on how they advance my life goals rather than letting my life serve my work. So if you took a look at my profile on LinkedIn, you might think, how does all this stuff fit together? This is a really weird career. And the truth is I've reinvented myself numerous times during the course of my career, largely in service of those life goals.

Jennifer Petoff:

And one of my key life goals, for example, was to live in different cool places and travel the world. And that really is a guiding star that I use to evaluate career opportunities. So is this thing that I'm thinking about doing, is this opportunity going to take me closer to or further away from this goal, to live in different cool places and travel the world? And I've been fortunate in that I've been able to trade pretty heavily on my transferable skills and the foundational skills that I developed during my chemistry PhD research. I don't know if you could-

Kurt Andersen:

So, can you expand on that?

Jennifer Petoff:

Oh go ahead. Go ahead.

Kurt Andersen:

Yeah, I was going to say, can you expand on how Chemistry PhD research gives you skills that apply to running a book or running an educational program for SREs?

Jennifer Petoff:

I sure can. I sure can. Because I think being a grad student and particularly in the sciences teaches you a lot of things. So first and foremost, the scientific method. So how do I develop a hypothesis to test? How can I set up experiments? How do I launch and iterate? It teaches you analysis. So how do I draw conclusions with data? It could be research data, it could be business data, it teaches you observations. So how do you see and document the world? Like, so looking for clues. Like, there's also an element of perseverance. So when you're doing research, 90% of what you try doesn't work and you really need to stick with it. And I think that was very true of writing the SRE book. Even when we were getting started thinking like, "Are we ever going to get this over the line? How are we going to crowdsource this and write a thing that ends up in the hands of the public?"

Jennifer Petoff:

So perseverance and troubleshooting. So when something doesn't go to plan, figure out what's wrong and how to fix it. That's a key element that I learned in my PhD program. And of course, oral and written communication is super key when you're a grad student and you're defending your ideas with data. And finally, it's all about learning fast as well. So, at the start of my PhD program, I knew very little about the topic, but by the time you're done, by the time you write your thesis, you're a world expert, in certain cases often on a teeny tiny little esoteric area, but you've gone from zero to expert in some cases. So I've been able to use that to really ramp up in different areas and learn, to go from knowing nothing about Site Reliability Engineering to now knowing quite a bit and speaking about it publicly and such.

Kurt Andersen:

Okay. Okay. And so did you go straight from the lab to Google in your career then?

Jennifer Petoff:

No, actually there were a few pit stops along the way. You're right though. I did start my career very traditionally, so I graduated with my PhD and then went to work at a large chemical company doing hands-on research. And then it was at that point I started to notice some really cool volunteer opportunities that were available. So our HR team would send alumni back to campus to do technical presentations and perhaps try and recruit that next generation of graduates. And I thought at the time, "Ooh, that's a great opportunity to travel. Sign me up." I signed up to go back to my alma mater and found that it was super fun. And from there, I volunteered to go to a bunch of different schools.

Jennifer Petoff:

I then parlayed that volunteer experience into a full-time role in university relations in the chemical industry. So got out of the lab, went into university relations. And a lot of it was just being in the right place at the right time. And in fact, I thought I'd do this for a couple of years and then, probably go back to research or maybe move into something different, but within the chemical industry. And it was at that point that fate really intervened. So I was an early adopter of LinkedIn. I found it, there was something so satisfying about making and visualizing professional connections. So I built up this crazy network on LinkedIn. And then one day a Google recruiter reached out to me. And of course my first thought was, what does Google want with a chemist?

Jennifer Petoff:

But I was curious, so took the meeting and then an interview and turned out they liked my combination of the physical sciences background combined with that practical university relations experience that I developed. And lucky me, the interviews went well and the rest is history. And I think from...

Kurt Andersen:

Nice.

Jennifer Petoff:

Yeah, I was going to say key takeaway, I think is keep an open mind. So, even if a career opportunity isn't the norm or it's unexpected, there's no harm in exploring and seeing where it might take you. So follow your heart. That sounds a little cheesy, but that's something I do believe in.

Kurt Andersen:

So how does working for a tech giant (although in 2007, I don't know that Google was that giant a place) compare to the academic world?

Jennifer Petoff:

That's a good point. Google was still fairly big in 2007, much smaller than it is today, but you're right. But yeah. Thinking about it, there are definitely some similarities. I think Google prides itself on being a data-driven company, and moving from that culture of academia to tech wasn't the enormous leap that you might think it would be. And again, I am a super fan of developing a hypothesis and designing some measurement to test that hypothesis. And even though what I'm hypothesizing about now is SRE learning and development programs rather than olefin polymerization catalysis, which is what I did in grad school, the underlying techniques are really, really the same.

Kurt Andersen:

Okay. So obviously you're heavily invested in SRE education. And with your experience in university relations, in academia, I'm guessing you may have invested a little bit of thought about how universities can better prepare people for a career as an SRE. And that's been a challenge that's come up in multiple talks and people have difficulties hiring folks for the role. So what do you think universities could do to address a role like SRE and prepare people better?

Jennifer Petoff:

And I think that's a great question, Kurt. And I think that is definitely an interesting one. The thing that we see at Google is that you get people that come in as software engineers and they've learned how to do that in school. Like you say, they don't teach SRE in school. At least not today. But if universities were to teach SRE, or if students want to teach themselves, I think the most important thing a student can do if they're interested in becoming an SRE is to think about scale. So, what happens if this pet app or service that I'm working on really takes off? You don't want to have a success disaster on your hands.

Jennifer Petoff:

So instead of thinking about running things on your laptop or on small scale, think about what happens if we scale this up to a million or even a billion people. I think non-abstract large system design is something that's not often covered in a college curriculum. And I know the Google SRE team has actually developed some resources and exercises that folks could try for themselves. So if you check out sre.google/classroom, you'll find some fun activities there. In fact, sre.google has a bunch of different resources that hopefully you'll find useful to support both current and aspiring SREs. The other piece-

Kurt Andersen:

Those workshops-

Jennifer Petoff:

Go ahead, Kurt.

Kurt Andersen:

Those workshops, when the team has done them at conferences, have been really popular and well received, so I heartily encourage people to go and check those out and get a group of people together to go through it together, not just individually, because I think that partnership and working together is an aspect that is under-represented in academic training. Would you agree?

Jennifer Petoff:

No, that does make a lot of sense, Kurt because in yeah, in school maybe you're working on your own to get a particular grade or whatnot and yeah, once you work in industry and particularly as an SRE, you have to work with a wide range of people in different functions to keep these systems up and running and delivering for our customers. So yeah, definitely a critical skill to develop and working together, communicating that teamwork. The other thing that I think... Can I add one more?

Kurt Andersen:

Yeah.

Jennifer Petoff:

One more quick one here.

Kurt Andersen:

Please. Yeah, please. Yeah.

Jennifer Petoff:

I was just going to say, I think getting hands-on experience through internships is also a great way for students to get a sense of how do companies actually run their services in production and that's invaluable to an aspiring SRE.

Jennifer Petoff:

Even better. I know at Google, we hire students into internships within SRE specifically. In fact, I lead a 20% project here in Ireland where we're actively recruiting interns from universities and institutes of technology around Ireland for SRE internships, and it starts as early as the summer after your first year. So start early and get in there and get that experience often.

Kurt Andersen:

Great.

Jennifer Petoff:

The other piece I was just going to say is, I think there are two aspects to software development that folks need to think about. There's the what, which you might equate to the shiny product features. And then there's the how. So how do you deliver those features to your users? How do you do it consistently, reliably? How do you manage that service to meet the needs of your users? So getting in the habit of thinking about the how, I think, is an invaluable skill for an aspiring SRE, because once you launch your product or service, the work isn't done; it's only just beginning, in fact.

Kurt Andersen:

I like that distinction between the what and the how with SRE being heavily focused on the how. That's a great handle that people can use. So you've also talked and published a bit about how the principles and best practices of SRE can be applied to everyday life. Walk me through a couple of those scenarios please.

Jennifer Petoff:

Oh, of course. Happy to do that. Again, I think once I joined SRE, I love how the principles and best practices are so rational and relevant to lots of different things. So I like to apply them in non-traditional ways. So a few examples. Yeah. And actually just to name a couple of those principles and practices first. So, number one, you can't improve what you can't measure, so you need robust monitoring. Seeing failure as an opportunity to improve, not to brandish pitchforks; that really gets to the heart of that SRE blameless culture. And I think vanquishing toil through automation rather than feeding the machines with human effort is another key facet of an SRE practice. And I know for my team, the SRE EDU team, we actually pride ourselves on applying SRE principles to running those training programs so that we can scale the audience that we serve superlinearly relative to the size of the team. So we can just really go for scale there and run a really world-class program. I think you've seen my talks on that at SREcon there, Kurt.

Kurt Andersen:

Yeah. I have. And yeah, they're great, but I want to follow up quickly before we go on into some examples. The first principle, that you can't improve what you can't measure, has gotten some flack or some pushback on the Twittersphere, with people saying, "Well, sometimes the most important things can't be measured." And I think mostly they're talking about how you can't necessarily attach hard and fast numbers to some of these more qualitative aspects. Can you give me an example of how something like the success of an education program, in your case for SREs, can be measured? Because that seems very soft and fuzzy.

Jennifer Petoff:

And I think that's one of the pitfalls of running a training program. So there can be this unconscious bias that's, "Oh, that's easy. That's fluffy. That's not possible to measure." And maybe a better way to look at it is you can't improve what you can't observe or you can't evaluate. The way that we monitor our programs is basically, you look at survey feedback, so you can take qualitative survey feedback, either looking at net promoter scores or looking at self-reported increases in confidence before and after the training, and you can turn that into what percentage of your audience reported higher confidence on key job-related tasks.

Kurt Andersen:

Okay.

Jennifer Petoff:

Again taking survey feedback and turning something that's more qualitative into something that you can then look at and say, "Okay, well, if we're trying to improve how people improve certain job related skills, are we succeeding in that area? Could we be doing something different? And if we're not doing it well enough, what changes can we make to drive improved outcomes for people?"
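
As a rough illustration of turning survey feedback into numbers, the sketch below computes a net promoter score and the share of attendees who reported higher confidence after training. The field names, the 0-10 recommendation scale, the 1-5 confidence scale, and the sample responses are all assumptions made for this example; none of them are details of the SRE EDU surveys described here.

```python
from dataclasses import dataclass


@dataclass
class SurveyResponse:
    # Hypothetical survey fields, invented for this sketch.
    recommend: int          # 0-10: "how likely are you to recommend this class?"
    confidence_before: int  # self-rated confidence before training (1-5)
    confidence_after: int   # self-rated confidence after training (1-5)


def net_promoter_score(responses):
    """Standard NPS: percent promoters (9-10) minus percent detractors (0-6)."""
    if not responses:
        raise ValueError("need at least one response")
    promoters = sum(1 for r in responses if r.recommend >= 9)
    detractors = sum(1 for r in responses if r.recommend <= 6)
    return 100.0 * (promoters - detractors) / len(responses)


def pct_more_confident(responses):
    """Percent of respondents whose self-rated confidence went up."""
    if not responses:
        raise ValueError("need at least one response")
    improved = sum(
        1 for r in responses if r.confidence_after > r.confidence_before
    )
    return 100.0 * improved / len(responses)


# Made-up responses from one class instance.
responses = [
    SurveyResponse(recommend=10, confidence_before=2, confidence_after=4),
    SurveyResponse(recommend=9, confidence_before=3, confidence_after=4),
    SurveyResponse(recommend=7, confidence_before=3, confidence_after=3),
    SurveyResponse(recommend=4, confidence_before=2, confidence_after=2),
]

print(f"NPS: {net_promoter_score(responses):.1f}")
print(f"More confident after training: {pct_more_confident(responses):.1f}%")
```

Either number can then be tracked per class or per program, which is what makes the "soft" feedback something you can evaluate over time.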

Kurt Andersen:

Okay. Okay. Makes sense. And it also seems like for something like a training program that you're talking a very different time scale than software engineers are used to thinking about for their systems or SREs for their systems. Because usually they're thinking in terms of minutes, hours, maybe days at the longer range. You're probably considering training program success in what? Months, years, something in that range?

Jennifer Petoff:

Yeah. Yeah. You're exactly right. So, you can evaluate instances of a class but if I think it's better to look at, how is the program doing over time? So for example, our orientation program, we run it every month in different locations around the world. So, what are the month on month trends? What are the year on year trends? We do look for evaluating survey feedback, for example, to look for any potential areas of concern that people may have, and we would address those more quickly. So it's important to monitor, observe on a very regular basis and then just keep in mind those trends that you're seeing over time.
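
A minimal sketch of watching month-on-month trends rather than single class instances might look like the following. The monthly scores, the "YYYY-MM" keys, and the 10-point drop threshold are hypothetical values chosen for illustration, not figures from the program discussed here.

```python
def month_on_month_change(monthly_scores):
    """Return (month, delta) pairs: each month's score minus the previous month's.

    Keys are assumed to be ISO "YYYY-MM" strings, which sort chronologically.
    """
    months = sorted(monthly_scores)
    return [
        (cur, monthly_scores[cur] - monthly_scores[prev])
        for prev, cur in zip(months, months[1:])
    ]


def months_needing_review(monthly_scores, drop_threshold=10.0):
    """Flag months whose score fell by at least drop_threshold points."""
    return [
        month
        for month, delta in month_on_month_change(monthly_scores)
        if delta <= -drop_threshold
    ]


# Made-up monthly survey scores for a recurring orientation class.
scores = {"2021-01": 40.0, "2021-02": 45.0, "2021-03": 30.0, "2021-04": 32.0}

for month, delta in month_on_month_change(scores):
    print(f"{month}: {delta:+.1f}")
print("Review:", months_needing_review(scores))
```

The same comparison works for year-on-year trends by pairing each month with the same month a year earlier instead of the previous one.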

Kurt Andersen:

So let's get specific, let's dive into some of these measures. I know that you're familiar with the site reliability hierarchy that shows up in the beginning of part three of the SRE book. If I'm remembering my reference correctly from Mikey Dickerson. Is it Mikey or Mikey?

Jennifer Petoff:

Mikey Dickerson. Yes.

Kurt Andersen:

Mikey Dickerson. Okay. So he starts off at the bottom with monitoring and I think that equates to the measures and the metrics that you're talking about, walk us through the rest of the pyramid. And then how do you apply that to your training system?

Jennifer Petoff:

Exactly. I think you're exactly right. There are these parallels that you can draw between that service reliability hierarchy and what we're calling the training hierarchy of needs, which is just basically adapting that to the training context. So if you look at the original version, the one that Mikey coined and illustrated, you've got monitoring at the bottom, followed by incident response, postmortem and evaluating how things went after that incident, testing and release, capacity planning, development, and finally, at the pinnacle of your pyramid, you've got product. And again, we do that monitoring in the form of attendance tracking and survey feedback.

Jennifer Petoff:

We address any issues that surface. So again, if someone was like, "Oh, that instructor was terrible," for whatever reason, or there was an issue where something was outdated with your content, we address those issues that surface. We write postmortems when things occasionally go wrong so that we can learn from failure. Maybe an instructor overslept or someone didn't show up to the class when they were supposed to. So, just making sure that things are always improving. We do a lot of testing, canarying and progressive rollouts of any new content and programs, and all the while we're scaling our operations, we're looking for opportunities to vanquish the toil through automation.

Jennifer Petoff:

So, whether it's keying in sessions into our scheduling tool, whether it's making sure we're enrolling the students and the instructors and such. That can be very painful and toily, so how can we actually vanquish that toil through automation and make the most of our limited human resources? And really it's only then once we've done that, that our program's going to be fully actualized and we can realize that full potential of the curriculum design and the program itself.

Kurt Andersen:

So how about examples of other ways to use SRE principles in real life?

Jennifer Petoff:

Sure. Yeah. Like I said, I've thought about this a lot. I've spoken at a few conferences about this and stuff, tweeted, et cetera, but ever since I joined the SRE team, I've noticed how those SRE principles and best practices can play out in real life for both good and for bad. So I've done a talk at a few different conferences on SRE anti-patterns in everyday life and what they teach us. You'd actually be amazed to see how well SRE principles and best practices fit the context of a busy and understaffed cafe. For example, I was in New Zealand to speak at a conference, and one cafe, I would say, didn't do a great job with their capacity planning, which essentially led them to feed their machine, the coffee machine in this case, with human toil. It was fascinating to watch that play out as I was sitting there waiting for my coffee.

Kurt Andersen:

Excellent. Well, if those talks are available for people to watch, we will add links in the show notes for those.

Jennifer Petoff:

Awesome.

Kurt Andersen:

I think your comment about being in New Zealand is a great segue into your alternate identity as a travel writer. How'd you get into that adventure and what's the most interesting place you visited and why was it particularly interesting?

Jennifer Petoff:

Ah, that is a good, good question. So, yeah, so I think how did I become a travel writer or a travel blogger? I think it was all about division of labor first and foremost, between my husband and I. He's the planner, I'm the historian. So he does all the research. He makes all our travel arrangements and then I take all the pictures, I document our trips. And I actually started Sidewalk Safari my blog about 10 years ago now to make it easier to consult the record, so to speak. I remember we'd be reminiscing about our past trips, and I remember things one way, he'd remember them another. And of course, by writing about the trips, we could figure out who was right. Usually me. No, just kidding. Not always me, but I have a better memory in some cases for details.

Jennifer Petoff:

But I would say my blog over the years has evolved from being a record of our travels written for ourselves to detailed posts that hopefully help others plan their trips and benefit from our experience on the road and also living in Ireland. And tying into your last question about SRE principles applied to other areas, I've even used my blog to promote SRE best practices with a post covering a retrospective I did after a bicycle accident on a trip to Croatia. So, yeah. How do you SRE a travel emergency, essentially?

Kurt Andersen:

That sounds like a great topic for a retrospective. Yes.

Jennifer Petoff:

Indeed. We don't want to call that one a postmortem, like no one was... It wasn't that dire, but yeah. Important to see what we could learn to improve our travels in the future in that experience. You did ask my favorite destination. It's so hard to pick just one place. Yeah. Maybe I'll pick top three. How about that?

Kurt Andersen:

That works. Sure.

Jennifer Petoff:

If that's okay. That works. That works. I would say Tasmania, was super, super awesome. Just the unique wildlife and just exploring the area around Hobart. I was able to do that on a trip to Australia at one point. Ireland's pretty awesome. I mean, since I'm not from here originally, it's now home, but I think Ireland is a great destination for exploration. Just the fact that there's things that are... We routinely run into things that are over a thousand years old and just amazing, amazing to see there. And then the final destination that I would throw into the mix, Sicily just for the amazing, amazing food, and obviously the culture and history and stuff too, but largely for the food.

Kurt Andersen:

Do you have a particular dish in mind or is it just their whole cuisine?

Jennifer Petoff:

Well, I think the cuisine in general, but I think what really caught my attention, I would say, I remember being in Palermo and we ordered some gelato. They have some of the best gelato in the world, and basically they put gelato in a sandwich like a brioche. So it was basically brioche three or four balls of gelato and a little cone sticking out the top. I'm like, "I can get behind this, for sure." I have a sweet tooth. I have a sweet tooth.

Kurt Andersen:

I've never seen gelato served that way. That sounds really interesting and worth trying. So one of the things that I've noticed at conferences that we've been at is that people seem to bombard you with unicorns whenever you show up. How did that come about? That seems like a somewhat unique thing.

Jennifer Petoff:

People bombard me with unicorns? I think, yeah, for some reason, at least SREs at Google have this fascination with unicorns. And when we started the SRE EDU program, we needed a logo. We wanted some kind of catchy logo and branding for our team. So we have this unicorn with a jet pack. We joke and call it the rainbow farting unicorn, because it's got a rainbow coming out of its jet pack actually. But I think that unicorn iconography comes from the sense that like SREs can be... Like, the skillset can be hard to find. So, oh, it's like a unicorn trying to find this skillset.

Jennifer Petoff:

And then what we do as a team is we strap jet packs to those unicorns and help them to be successful. And it's just, unicorns are fun. Yeah. We had the giant blow-up unicorn, inflatable unicorn, on stage in [Düsseldorf 00:29:07] at SREcon, I think. And I always wear my SRE EDU unicorn t-shirt, so it's good to keep things a little lighthearted and enjoyable. Keeps the job fun.

Kurt Andersen:

Got it. Okay. I did not realize that was the logo for your SRE EDU group. So that makes good sense then. As a parting question here for you, since you are an expat from the US, what advice would you give to people who are considering moving from the US to another country? Based on your experience obviously. You can't be a world expert on it, I expect.

Jennifer Petoff:

Yeah. And just for context. Yeah. So I was born in the US and I've been living in Ireland for the past 10 years. So yeah, I can speak to, yeah, what it's like to move from the US to somewhere else. And I think in some ways, like I'm a program manager, so build out your plan. It's always more complicated than you think it might be and just break it down into the detailed steps to make it a little bit easier I would say. Some things to consider if you're looking for possible destinations and such, you might want to consider the tax implications. For example, some countries have treaties with the US, others don't. As a US citizen, you always pay taxes in the US. I think cost of living is a consideration and salaries can vary compared to what you're used to in the US.

Jennifer Petoff:

But I think the most important thing, regardless of those... Some of that's minutiae, right? It's all about the experience of getting a different cultural experience, getting a different perspective on things. So I think just being open-minded; there can be subtle differences. I thought, when I moved to Ireland, "How different can it be? They speak English and it's great and stuff," but there are a lot of subtle differences and stylistic differences. And I think it's important as someone who's coming into that culture to be respectful and to be adaptable. And it's up to you to adapt rather than forcing people to adapt to your style coming in from abroad. So be open-minded and be willing to adapt.

Kurt Andersen:

Okay. Okay. Yeah. Makes sense. And thanks for the advice. How long a process did it take you when you started to think about moving to when you actually landed and got settled? Do you recall?

Jennifer Petoff:

That's a good question. So, I moved to Ireland originally as part of a rotation program. So nominally, I was going to spend one year in Ireland. And fun fact, I had never set foot in Ireland before taking this assignment, the one year assignment, but I thought for a year, "How bad can it be? Even if it's terrible, it's time limited." But of course I'm still here 10 years later. So clearly things worked out really well. But I would say from the time I applied to the time I actually got the offer and moved here was maybe three months. I do think it helps if you have a job and the possibility for a company to sponsor you, because there can be different categories of visas and not having to navigate that all on your own is a plus I would say.

Kurt Andersen:

Right. Okay. Yeah. That makes great sense as well. Okay. Well thank you so much, Jennifer for joining us today on Resilience in Action. And we will have show notes attached to the blog entry for people. So make sure you check out the links that Jennifer's provided and including a link to her Sidewalk Safari, if you're interested in some interesting travel tips. I know it was great when we came to SREcon in Dublin to look at your recommendations for local things to check out. That was very helpful.

Jennifer Petoff:

Glad you had a good experience there in Dublin. And thanks for having me on the show, Kurt. It's always a pleasure. Thank you so much.

Kurt Andersen:

Thanks Jennifer.
