Resilience in Action is a podcast about all things resilience, from SRE to software engineering, to how it affects our personal lives, and more. Resilience in Action is hosted by Blameless Staff SRE Amy Tobey. Amy has been an SRE and DevOps practitioner since before those names existed. She cares deeply about her community of SREs and wants to take what she’s learned over the 20+ years of her career to help others. In our second episode, Amy chats with Tim Banks, a technical account manager at Mission who has held the title of database engineer, DevOps engineer, SRE, American National and Pan American Brazilian Jiu-Jitsu champion, and professional chef during his career. In the podcast, Amy and Tim discuss:
See the full transcript of their conversation below, which has been lightly edited for length and clarity.
Amy Tobey: Welcome, Tim. I'm so excited to be talking to you again. Last time we talked we had a really good conversation about the crossover from SRE and infrastructure work into what you might learn being a professional chef or a martial artist, both of which you’ve been.
Tim Banks: It's good to see you again, Amy. I've been in and out of the industry for a couple of decades. During a time where I burned out and decided I wanted to follow the dream, I became a professional chef, going from line cook, fry person, grill cook, all the way up to running my own kitchen, opening up a couple of restaurants, and really seeing both the practical side of everything from cutting veg, to ordering and inventory, to managing people and allocating resources.
It's really interesting to see some of the concerns behind putting food on a plate. It goes all the way from “Can you scrape this stuff off the pots?” to “We need to go work on a kitchen design so it's not too hot for the people sitting directly in front of the kitchen.”
Amy Tobey: A lot of our listeners probably haven't worked in a kitchen, or did so only at entry level, so could you relate it to something that our peers might know a little bit about? There's so much that we don't think about as diners.
Tim Banks: At peak, you're talking about a weekend night and you've got every table, every seat in the restaurant full. You've got 30 or 40 tickets in, you've got every burner filled, you've got every position on the fry field full, you've got plates lined up, you've got servers coming in saying that they need this and that. You've got the restaurant owner in there saying, “This restaurant reviewer's coming in and we need to get them taken care of.” You've got your dishwasher who's telling you that the dishwasher is broken and he's having to do stuff by hand. Your sous chef is yelling at the line cook because he's doing something wrong. Your salad person is letting you know you're out of this type of field greens and it's on the menu, and you're sitting there trying to make sure your sauce doesn't break.
Amy Tobey: It sounds just like working at a startup. It reminds me of my job right now where I have to think about what other people are doing and try to bring it together for a finished product. One thing you said that really caught my attention was, there's the tickets that are coming in, and then there's all the stuff that comes in async outside of that system.
Tim Banks: When you work in a kitchen, there's always going to be that server who forgot to tell you this order's supposed to be like this, or forgets to get the bread. So what'll happen is you'll be 85% of the way to getting a table out and the server will say, "Oh, I forgot to tell you that so and so has a grain allergy." That was important information to know beforehand, so now what we'll do is re-fire a position on the table and re-cook it meeting the new parameters, and then try to coordinate the rest of the food because nobody likes food to come out asynchronously. No one likes lukewarm, room-temperature food, so you have to try to get those dishes back in. It's like, “Now what do we do with these artifacts we were previously cooking, do we try and keep them hot?” If it's a medium rare piece of meat, you can't keep it hot because it's going to keep cooking. You have all these considerations because the main concern for the restaurant owner is the margin.
Amy Tobey: You're having to do a lot of adaptation in the mix there, making decisions about not just the one customer, but being able to solve for how that cascades down the line for other customers' experience.
Tim Banks: And the concern that you have as the sous chef, executive chef, chef de cuisine, or higher up is that you have to take into consideration the priorities of everyone down the line. If I'm on fry and usually fry's just a matter of putting stuff in the basket, dumping it down, and then pulling it back up, dressing it, and so forth, it's usually not overly difficult as far as all the other stations go, but you still have to manage it. You have to know that if I put this pan full of fries in the oil, my oil temperature is going to drop about 10 to 15 degrees. The lower the oil temperature drops, the longer it takes to cook the food and the worse it's going to come out. You have capacity constraints.
I can only put X-amount of food in the fryer at a time before the quality of food suffers because the oil gets cold. And after I take something out, now I have to let the oil come back to temp because you actually want it to be hotter than what you're cooking it at.
Amy Tobey: And this is all while, especially on the fry station, dealing with the incredibly dangerous vat of boiling oil.
Tim Banks: Not only do you have the incredibly hot vat of boiling oil, you're next to the saute station or the grill which also have very hot surfaces. If anything icy or watery drops into the oil, you have yourself a mini explosion. You have all that to deal with in a small kitchen where people are moving back and forth and around each other with knives, hot pans, or fragile things. Everything in the kitchen is predicated on preparation and communication. You can have either, but if you don't have both, everything's going to fall apart.
If you don't have your station set up, if you haven't chopped all the veg you need, if you haven't put everything within reach or on the lowboy, if you don't have your towels ready, if you don't have your salt, pepper, and bowl set up in advance, you're going to get into the weeds, and when you get in the weeds it's really hard to get back out.
Amy Tobey: That sounds a lot like software development.
Tim Banks: Your tools have to be set up in a way that works for you. You'll see some very rigid chefs that will try to do this and dissuade people that can't conform. But the good chefs know that everyone's mise en place is different. All that matters is that if someone else is coming to work at that station just for a little bit, they at least have to be able to see where everything is. It only matters how well it's documented.
It doesn't have to be the same way every time. I'm left handed. As a left hander, my station setup is mirrored. But if someone comes over to work at my station for a second, it's still well arranged. If they need to know where something is, I have a little card in front of my lowboy that says where everything is and indicates vaguely how much we have left. Once we're halfway out of something, I'll mark a line through it as half and then once it's all the way gone, I put a second next to it.
Amy Tobey: There are a lot of elements there for a tight team moving at incredible velocity. But the part that I find extra fascinating is, in software, we're often moving at a glacial pace by comparison. Our sprints are two weeks long and it sounds like a modern kitchen might do a sprint's worth of work every hour.
Tim Banks: Usually there’s a lull right when the doors open, then you have your rush. Rush is usually going to be 1-3 hours, depending on the restaurant. Some places may have a late rush but that prime time, the big part of the curve, is uninterrupted. It gets worse and worse and worse until it finally gets better.
Amy Tobey: Then the uninterrupted part is a result of advance planning and preparation.
Tim Banks: Exactly. And communication.
Amy Tobey: And communication. When I think of chefs, most of my knowledge comes from watching terrible TV shows, or sometimes great TV shows that feature terrible chefs. So can you talk a little bit about the cultures that evolve and maybe a good culture versus a bad culture?
Tim Banks: The traditional chef environment that people think of for fine dining is a French one called the brigade system. This involves a very regimented, militaristic, hierarchical order where the chef de cuisine will communicate with the executive chef. Everyone has specifically defined roles and the communication usually flows only one way unless you're requesting a status on something.
In a typical brigade system, your sous chef will say something like, "Where am I at on table 14?" and everyone will look up and respond. That's how it operated for a very long time, and if you weren't good in that regimented work environment, you didn't do well in the fine dining restaurants.
It's hard to say when exactly it started to change, but you can say the '80s, '90s, or 2000s. I think the biggest change happened during the last economic downturn in the early 00s, during the post-9/11 recession, where you saw a lot of Michelin star chefs opening up food carts or small restaurants because it's all they could afford to do. You're talking 10 people in the restaurant at most and a good bar. They were working with much more relaxed standards and plating. It was much less artsy fartsy, for lack of a better term.
Amy Tobey: So you're saying that the kitchen experienced that renaissance as much as those of us who enjoyed this explosion of great restaurants showing up.
Tim Banks: It was a democratization of the kitchen. A lot of brilliant, talented chefs that did not do well in the brigade system were able to thrive. You saw a lot more people of color, you saw a lot more women. You saw a lot more LGBTQ people being able to actually work in these environments because what mattered was, can you cook the food well, do you have skills, do you have talent, can you come up with new plates. It didn't matter how white you kept your whites or whether your towels were always folded into thirds. I think we see that a lot in software development where it doesn't matter if you wear a Polo shirt and khakis, or if you always use this development environment. As the larger environment became more inclusive, you saw different types of people developing in different ways. The types of things that are being written, the services being offered are way better than they ever have been. We've got a long way to go, don't get me wrong. So when you go out to the restaurant and you see a high-end food truck or someone's little 15 seat pizza shop that's super, super good, and you find out this person went to the Cordon Bleu or did a summer at the French Laundry, that’s because more and more people can get that experience without working in a regimented environment.
Amy Tobey: You've had experience with another regimented environment earlier in your life. Did you feel like that prepared you at all?
Tim Banks: I joined the Marine Corps right out of high school, and that prepared me for quite a bit. There is a certain amount of healthy and some unhealthy disdain that folks who enlist out of high school have for folks that went to college. My enlistment in the Marine Corps was avionics and navigation. In order to be able to do that you had to start with the basics of electrical theory with voltage and resistors, all the way to circuit theory: making transistors by hand, soldering a micro miniature, making radars from a box full of parts.
But it's interesting because you learn this by having a crusty old gunnery sergeant yelling at you inches from your face on how to count in binary and hexadecimal.
Amy Tobey: That seems like a terrifying way to learn that.
Tim Banks: It is, in fact, a terrifying way to learn. But if you are the type of person that can learn in that environment, you'll do really, really well. The military is not for everybody; it's for a very select few people who have the aptitude to do the work and are receptive to learning like that. I was one of those few. I did a semester or two at college. It's just not my game. If you're going to sit there and draw on a chalkboard and just talk to me, I'm going to tune out. I'm going to go somewhere else.
Amy Tobey: So that translation from abstract ideas to practices within a classroom environment didn't work as well for you, but being in the thick of it, even though somebody's screaming at you, was how you learned and thrived.
Tim Banks: Absolutely. The military distills out a lot of stuff that's not relevant to doing the job in the technical schools. Now don't get me wrong, it's not essential when I'm trying to solder for me to know the muzzle velocity of an M16A2 service rifle. But you still know it because that's part of your job. It allows you to multi-thread well. If I'm down there soldering something and someone asks me a question about a completely different type of gear, I can have a conversation with them about that type of gear and then respond to someone asking me what year the Marine Corps was founded. And I'll be like, “1775 by Samuel Chapman in Tun Tavern in Philadelphia.”
Amy Tobey: Oh, wow. I like to say when you're 20, you're basically Wolverine. You can get pounded into the ground and a few minutes later you're probably going to bounce back. And then as we age and get into our 30s, it gets a little harder to adapt and to bounce back from things, especially those stressful situations.
Tim Banks: What I've found is that I can still party like a rock star, I just can't recover like a rock star. I can keep my nose to the grindstone, but when I finally close the laptop, I'm done. I'm not opening it up anymore. I just don't have the energy for it.
Amy Tobey: But these days when you close your laptop, you often go to the dojo.
Tim Banks: The one consistent constant in my life I think has been learning to embrace the suck. Being in the Marine Corps. Being in the kitchen when everything is sideways, it's 140 degrees, and you're dripping sweat, sucks. In Brazilian jiu-jitsu when there's a 350 lb person on your stomach trying to choke you to death, it's not great. It kind of sucks.
But even in that situation, you're solving for a lot of problems. What's more important? I know this knee on my belly is very uncomfortable and this person is very, very heavy, but if I let him drop his elbow another two inches, I'm going to pass out. So I have to deal with the elbow. Now I have to deal with the hand. Now I have to try and move my hips so his knee moves off my belly just a little bit. I don't have to get it off all the way, I don't have to break his grip on my collar, I just have to do enough that I can breathe a little bit. Now that I've handled those immediate problems, I can solve other problems.
Amy Tobey: That reminds me of my favorite phrase that I use all the time: “We don't have to fix it, we just have to make it suck less.”
Tim Banks: It’s the MVP of life, right? I don't have to win, but I don't want to pass out right here on the mat in front of my family.
Amy Tobey: That's a little extra drive, having the family there.
Tim Banks: Oh, yeah, it's fun. I've won a lot, but I've lost some too. My kids are all into it, my wife competes as well. It's interesting when the little kids don't quite understand they’ll be like, "Oh, Daddy. We love you," versus my nine year old who's a world champion in Gi and No-Gi in Brazilian jiu-jitsu herself. She’s like, "You really probably should have shrimped your hips out a little more," and I'm like, "Yes, I know what I should have done."
She's the only world champion in the family and she's normally pretty humble about it, but every now and then she'll remind me and my wife that she is in fact the only world champion. She has a gift for it beyond the work she puts in. She just has a natural adeptness, which is good for me because I love it. I train when the gym is open two to three times a day, every day.
Amy Tobey: You mentioned that one of the ways that Brazilian jiu-jitsu differs from some of the more I guess mainstream martial arts is the focus on kata, or being in the moment. That fascinated me because very often as systems people working on production systems, we have to make those choices, too. Do we go after the theoretical right answer or do we go after the immediate fix?
Tim Banks: In jiu-jitsu, there are techniques. You can do a very, very, very basic approximation of some of the techniques by yourself. But they have no context. And in Brazilian jiu-jitsu, context is everything. You could practice a kimura sweep by yourself, but it means nothing unless you're practicing with someone and know where you should have your feet in relation to your rear end when you lift up, where their weight has to carry on your hips when you lift them, and how close you have to have your elbow to their elbow in order to be able to sweep them over. I could tell you the form and you could simulate one by yourself, but if you have never actually practiced it with a person, you won't be able to do it.
Amy Tobey: Because it's fundamentally an adaptive procedure, right? Even if you sparred with the same person every time, each time you do it it's going to be a little bit different.
Tim Banks: If I'm going to kimura sweep someone who's shorter than me, I'm going to use different variations of the technique than I'm going to do for someone who's taller than me, or lighter than me, or heavier than me, or who carries their weight in their chest versus their hips. You can't say, “I'm going to do this every time” because it doesn't work. I think this is important for us because you have to be constantly adaptable. You have to understand that you have a technique or method, but it's never going to hit like that. You can watch a bunch of jiu-jitsu videos, but when you get on the mat with somebody who's been practicing jiu-jitsu, they're going to kill you.
I am somewhat okay at jiu-jitsu because the people that I train with are good. Not only are they good, but they're good training partners. The highest level black belts in the world of jiu-jitsu aren’t necessarily the best teachers or even the best referees.
Amy Tobey: We know that from our professional work too. The 10X engineer is off creating technical debt at a 10X rate, but probably isn't pulling people up behind them.
Tim Banks: You see people that are absolutely brilliant and you just think, “Wow, I can't see how they're doing that.” But they can't explain it to you, either. There are people that can do and do, but have a really hard time trying to explain it. That's fine if you're doing something competitive at an individual level.
Amy Tobey: I bet everybody has something like that. It feels like Brazilian jiu-jitsu is that way. I'm thinking of passwords. My son will ask me for passwords and I don't know what they are, I just type them.
Tim Banks: I remember when I used to have to remember 40 or 50 passwords. Now all your passwords are a variation of one thing and if you guess the algorithm, then you can pretty much unlock anything. So thank goodness for password managers.
Amy Tobey: The last topic I really wanted to bring up is COVID-19, the lockdown and the economic impact. I feel like you're in a position to see a lot of the outcomes and decisions people are having to make under extreme pressure with reduced resources and no playbook. What are you seeing out there in the field?
Tim Banks: There are a few themes that I'm seeing. There is the theme of the company that is economically impacted because their core business is no longer there (catering to parties, or gatherings). Their customer base is drying up and they have not architected themselves to be resilient to great fluctuations in demand.
This is a scale-down problem. Now, there are others that are seeing impact to their customer base, seeing demand dry up, and have architected themselves to allow them to be able to scale down. There are also companies that are seeing a great uptick in usage and demand because people are home now playing more games or ordering more food, and they have not prepared themselves to be able to scale up at a very high rate of speed. And then there are the ones that are seeing the large demand and have architected themselves to be able to scale up very well, very quickly.
So you're seeing scale-up or scale-down and are-prepared or aren't-prepared. And the ones that have prepared themselves were talking about this months ago or even years ago. They said, “Well, I don't want to get into a spot where I can't scale up or scale down as needed. What do you think we should do about this?” or “Are we going to paint ourselves into a corner if we do this?” or “We have painted ourselves into a corner but we're not too big yet, so let's go ahead and reset the bone before it starts to heal.”
Amy Tobey: In those places that are having success, have you noticed a common thread between them that's enabling that ability to move forward?
Tim Banks: Ego. We’ve all worked with the engineers I'm talking about: the ones who already know everything and don't need your help. As a rule, these are the ones who have painted themselves into a corner.
The engineers who are free, open, and transparent about what they know and what they don't know and are willing to accept input and suggestions are typically better prepared. And the difference that I've found is ego. I've been that engineer that's like, “I don't need to know about that, or I know this.” Then you shoot yourself in the foot when you're answering a pager at 3:00 AM every night for two weeks straight.
These people are now asking for help, and it's like, “We were trying to offer you some help before, but let's go in and see what we can do.” Then you pull back the covers and realize, “Oh my, this is going to hurt.”
Amy Tobey: Are you finding that the folks in the corner are becoming more receptive or hardening up?
Tim Banks: Desperation does weird things. It's almost like dealing with loss. There are some folks who understand that it has to be done, but hold on to the biases of “We've always done it this way,” or “Are you sure you know? Our person here knew what they were talking about, so why would they do it this way.” I've been there before, but it's always a difficult conversation. All I can do is say, “This is the best practice. This is the way you've done it. The more those two overlap the better off you're going to be.”
Amy Tobey: Then building that path requires getting the ego out of the way.
Tim Banks: Absolutely it has to get the ego out of the way. And the hard part is after you're already under duress, after you're already under load. It’s like holding a heavy weight over my head and realizing I need to tie my shoe versus tying my shoe before I pick it up. That's where we're at.
Amy Tobey: In terms of tying shoes, especially in the cloud architecture, what are the big three or four things that people can do now to prepare themselves for the next few months?
Tim Banks: The big thing is making sure that you have not been spinning up your resources onesie-twosie, in the console, or just through the CLI, and making sure things are automated. If things are automated and documented, then if you have to turn them off, you can turn them back on again in the exact same way. People haven't been testing their backups and we've been screaming about this ever since the first backup tape ever existed. So you're hearing from people, “Well, we could turn this down, but I just don't know if it's going to work when we turn it back on,” and the time to test that was before.
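To make the point about automation concrete, here is a minimal sketch of what "turn it off, then turn it back on in the exact same way" can look like once the environment is written down as data. Everything in it is hypothetical and for illustration only: the `ServiceSpec` type, the service names, the instance sizes, and the counts are assumptions, and no real cloud API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSpec:
    """Hypothetical description of one tier: what runs, on what, and how many."""
    name: str
    instance_type: str
    count: int


# The documented, version-controlled "on" state (made-up example values).
DESIRED = (
    ServiceSpec("web", "small", 4),
    ServiceSpec("worker", "small", 2),
)


def scaled_to_zero(specs):
    """Return the same specs with every count set to zero (the 'off' state)."""
    return tuple(ServiceSpec(s.name, s.instance_type, 0) for s in specs)


def plan(current, desired):
    """Return per-service instance count changes needed to reach `desired`."""
    current_counts = {s.name: s.count for s in current}
    changes = {}
    for spec in desired:
        delta = spec.count - current_counts.get(spec.name, 0)
        if delta != 0:
            changes[spec.name] = delta
    return changes
```

Because the "on" state lives in `DESIRED` rather than in someone's memory of console clicks, `plan(scaled_to_zero(DESIRED), DESIRED)` always describes the same way back up.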
Amy Tobey: That's the whole macro issue, right? We're shutting things down, but they've not been shut down or not been shut down this long before.
Tim Banks: The other thing is people who don't understand yet how much downtime costs them. Folks that are having problems scaling up or had services fail because they weren't able to scale up don't know how long they should have been down for, or how much it costs them to be down. So they're making a business decision to spend the money on fixing technical debt or just going with it, but you can't make that as a data driven decision if you don't know how much your downtime costs.
People can always get by as long as everything's okay. But when it's not okay in times like these, now you have to find out in production how much downtime costs.
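The trade-off described here can be put into rough numbers. The sketch below is only an illustration under simple assumptions: downtime cost is modeled as lost revenue per hour times hours down plus a flat recovery cost, and all function names and figures are hypothetical rather than taken from the conversation.

```python
def downtime_cost(revenue_per_hour, hours_down, recovery_cost=0.0):
    """Estimate what an outage costs: lost revenue plus a flat recovery cost."""
    return revenue_per_hour * hours_down + recovery_cost


def worth_fixing(fix_cost, expected_hours_down_per_year, revenue_per_hour):
    """True if a year of expected downtime costs more than fixing the debt."""
    expected_loss = downtime_cost(revenue_per_hour, expected_hours_down_per_year)
    return expected_loss > fix_cost


# Made-up example: 5,000 per hour of revenue, 10 hours of outages expected
# per year, and a 40,000 one-time fix.
example_loss = downtime_cost(5000, 10)
example_decision = worth_fixing(40000, 10, 5000)
```

Without a figure for `revenue_per_hour`, neither function can be evaluated, which is the data-driven gap being described.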
Amy Tobey: So you've got a customer or two that are realizing this and actually learning that cost the hard way?
Tim Banks: The hard way. These are all problems you fix on the whiteboard at the beginning. At an architectural level, people haven't thought about scaling individual tiers in their services independent of one another, even within microservices. We're finding that folks will have a very large Kafka tier; it's handling all this stuff, it's great ... But now we've turned all these other systems down and we have all these instances that are running Kafka now just sitting there idle and costing us money. But we don't know how to properly spin those down.
Amy Tobey: That's a tough one, especially when they're mission critical.
Tim Banks: It's the ZooKeeper problem. If there's any problem, ZooKeeper's going to have it.
Amy Tobey: It's a distributed single point of failure.
Tim Banks: Several single points of failure all wrapped up in one, and it's almost comical. That's the problem that folks are having; they built in a very large single point of failure that they can't really scale up or down.
Amy Tobey: It’s tough to differentiate, especially in databases because of the inertia of the data. Some of these systems are really fussy about doing mass node replacements.
Tim Banks: I think a lot of the problems we've been able to solve have been low hanging fruit: spin down your dev systems, spin down your UAT, spin down your staging. Turn it up when you need it. A lot of folks have been running their entire businesses on on-demand spend.
Amy Tobey: That’s money sitting there.
Tim Banks: A lot of money sitting there. All you’ve got to do is do a no-upfront reservation. It's not going to cost you anything, but you're going to save yourself a significant amount of money right off the bat. So please look at how much reservations are, please look at running savings plans if you need to, leverage Spot for your instances or leverage Spot instances altogether. Anything that doesn't have to be run in an instance probably shouldn't be running in an instance. There are a lot of people running containers as instances right now, which is comical. I understand there are some technical reasons; obviously not everything fits in a container, but there's a lot of stuff that does.
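As a back-of-the-envelope illustration of the reservation point, the sketch below compares a month of on-demand spend with the same fleet at a committed rate. The hourly rates are invented for the example and are not real prices from any provider, and the calculation assumes the instances run all month, which is the case where a commitment tends to pay off.

```python
HOURS_PER_MONTH = 730  # common approximation: 8,760 hours per year / 12


def monthly_cost(hourly_rate, instance_count, hours=HOURS_PER_MONTH):
    """Cost of running `instance_count` instances for `hours` at one rate."""
    return hourly_rate * instance_count * hours


def monthly_savings(on_demand_rate, committed_rate, instance_count):
    """Difference between on-demand and committed spend for an always-on fleet."""
    return (monthly_cost(on_demand_rate, instance_count)
            - monthly_cost(committed_rate, instance_count))


# Hypothetical rates: 0.10 per hour on demand versus 0.07 per hour with a
# no-upfront commitment, for 20 instances that never turn off.
example_savings = monthly_savings(0.10, 0.07, 20)
```

With these made-up rates the fleet saves roughly 438 per month; the real number depends entirely on actual pricing and on how many hours the instances truly run.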
Amy Tobey: You get a lot more node density and a little bit more efficiency.
Tim Banks: Much more efficiency, much more node density, and the thing that I like about it is a lot of times they're node-agnostic. Folks will have these gigantic systems to run a bunch of Docker containers but those Docker containers could run on a T2, T3, or some other very inexpensive system because it's not doing a lot. It's doing a central function but it doesn't require a ton of CPU, memory, or disk, but because these folks have all these big systems, that's what they're running with.
Amy Tobey: When times are good, yeah. You just give it the extra cores and it's less trouble.
Tim Banks: Though when times aren't good, you need to be able to shift, you need to put it on whatever it needs to run on, what the MVP should be. The bare minimum should be the minimum amount of CPU it takes to run it.
Amy Tobey: That's a great place to tie off on because I feel like some easy wins would be a nice place to leave everybody on.
Tim Banks: When people start looking for and finding the low-hanging fruit, the little easy wins and quick returns, it helps build momentum, and in a time like now, I think that's what we need. People need to have something to drive them through, something that's going to see some returns and then they can do the harder, more grueling work after that. But they can at least do it with some momentum.
If you missed Resilience in Action E:1 with Lorin Hochstein, you can listen to it here.