
Webinar - Blameless + Opsgenie Integration Tutorial
Take the "work" out of your incident workflow: Integrating Blameless with Opsgenie
Speakers

Aaron Lober

Paul Chu

Nicolas Philip

Video Transcript
Aaron Lober (00:00):
Welcome everybody. Thanks so much for joining us today. My name's Aaron Lober. I'm the head of product marketing here at Blameless. Today we're spending about a half hour together, and at the end of our time, I hope everyone here will both understand a little more about who we are at Blameless and what we do and, very pointedly, understand a little bit more about why our integration into Opsgenie is going to be a big deal for your incident management response process, or sorry, incident response process, I should say. Before I go into asking Nicolas and Paul to introduce themselves, it's a pretty intimate group here, Chuck, Chisel, and Erica, I was wondering if I could just see a show of hands: who is currently using Blameless today and is familiar with the way that our incident response process works?
(00:58):
I got a thumbs up from Chisel, so Chuck and Erica, this opening, I think, will be particularly relevant to you guys, help contextualize the wider story here. Before I do that though, I do want to open the floor to my partners in crime, both Nicolas Philip and Paul Chu. So Nicolas, if you want to just say hello and maybe give the quick 30-second background on your role in the project here.
Nicolas Philip (01:27):
Good morning, good afternoon, good evening everybody. I'm Nicolas Philip, nice to meet you all. I've been at Blameless for two years as director of product management and very excited about this new announcement with Opsgenie, which I've been working on the past few months. So looking forward to this session. Back to you, Paul.
Paul Chu (01:53):
Yep. Hi everyone. Nice to meet y'all. Paul Chu, been here for about three years now. I head up the customer success team as well as the sales engineering team, so both pre and post sales, but I'll be running the demo portion for today's webinar as well.
Aaron Lober (02:10):
Yeah, thank you Paul. We're going to work together on this. I'm really just going to do some very light table setting and then hand to Paul for the lion's share of our time to show y'all how the integration's going to work and also give sort of a high-level run-through of Blameless. One thing I want to say, just to get ahead of a comment that I anticipate, the call is being recorded and we're going to share the recording in post, so if you want to reference anything in here, you'll have an opportunity to do that.
(02:44):
And so let's talk a little bit about who we are. For anyone who isn't deeply familiar with us already, Blameless is really a reliability solution that DevOps and SRE teams can use to propel them forward on their journey towards fully embracing SRE practices, developing reliable products, creating happy customers. And in that context, a lot really goes into delivering reliable software. No aspect of it is more important than incident management itself, core incident management. So, Paul, if you want to drive forward. Very pointedly, we initially built, or we first built, Blameless to make managing incidents easier and to ultimately reduce the impact of outages when they do occur. So Blameless is an incident management solution that's built to carry your team through a very simple, flexible playbook from start to finish, allowing them to manage most of that process, most of that workflow, within their messaging app of choice, be that Slack or MS Teams.
(04:05):
And we automate really the most toilsome aspects of that process, the early incident management process, and also a lot of the communication steps within the workflow. And then we provide guardrails for the rest of that that we don't automate. And we've augmented this solution with more proactive tools for incident management like SLIs and SLOs and tools for incident learning like retrospectives and incident analytics. And ultimately having brought all of that together on one platform, we're really helping your team command, communicate, resolve, and learn and grow from every single incident. So with that said, however, another important aspect of this is that, in an attempt to maximize the effectiveness of this type of tool, we've also connected Blameless to most of the major players in the engineering ecosystem. So monitoring tools, observability, paging, alerting, on-call management, ticketing, status pages, the list is long, frankly. But all of this is connected into our incident management platform and that really brings us to Opsgenie itself. Now I'm going to hand over to Paul here to really set the stage on the Opsgenie integration and then show you how it works.
Paul Chu (05:38):
Awesome, thank you Aaron. So just to start off here, when things break, your incidents do pop up, the time it takes for you to mobilize and get folks onto the call or into the Slack channel, that response time matters. Pretty obvious statement but worth kind of repeating there. And for today, our focus will be around how Opsgenie can be leveraged with Blameless in order to help minimize that response time or the mobilization time for these different issues. So just to kind of go through some of the highlights of our enhancements here. What we've done here is that with our Opsgenie integration, we've essentially whittled it down to these three points. So the first would be around automating the creation of Opsgenie alerts from Blameless incidents. So this could be when an incident starts in Blameless, or it could be something where if you already have your Opsgenie implementation configured to fire off alerts, which I'm sure many of you do, that can also be a source where Blameless incidents are created as well.
(06:42):
So again, for those who aren't familiar with Blameless, we'll kind of show you some screens around what it looks like, but also happy to do a deeper dive at a separate time as well. Going to the second point here, when you are in a live incident and you're essentially triaging and understanding more about what some of these symptoms are telling you, once you understand a little better what the incident is dealing with, you can also trigger alerts directly into Opsgenie from the Blameless incident channel. So this is either a Slack channel or a Microsoft Teams channel. So we do have some slash commands to help assist you in terms of getting to the right team or the right service via Opsgenie as well. And then along the way, whenever Opsgenie is involved with these different activities within the incident, we will auto capture what we call timeline events and have those displayed within the incident record so that way you have an audit trail of when someone had actually paged out to a particular team or service as a part of the overall incident record, which we'll show you today.
(07:51):
So with that, let me go ahead and jump into the demo here. I'm going to go ahead and start in Slack since that's what we have here for our demo. Again, we're not going to go too deep into what a Blameless demo looks like, but again, happy to go through a more in-depth demo if you'd like in a separate time. So I'm going to go and just start an incident. We'll go and run a slash Blameless command. This will then give us a modal here. We'll go and just fill out the appropriate fields. We'll go and say, define what type this is. In this case I'll just pick the default incident type. This is essentially going to dictate kind of how the coordination looks like within Blameless. We'll go and select the severity level. And then you'll notice here that we have an Opsgenie field.
(08:33):
So this is going to be directly pulled from your list of services that you have in Opsgenie. So in this case, this is a type-ahead. If I were to go and start typing in a value, it'll go and give me the relevant services that match whatever I've typed here, and I can simply just go and select either one or multiple. So you do have that option just based off of what you know. Again, this is the context of, "Hey, we're starting an incident and we know which service that we'd want to ping already." So you do have that option. If not, you'll notice it is optional, so you can keep it blank as well. And then we'll just go and give this a title. We'll say, "Users unable to log in," and provide additional descriptions. And then you would send this and start the incident. So I'm going to go ahead and close this out since I already have one created here.
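The type-ahead behavior described here can be sketched in a few lines. This is only an illustration under stated assumptions: the transcript does not say how Blameless matches typed text against the Opsgenie service list, so the case-insensitive matching rule, the function name `match_services`, and the sample service names are all hypothetical.

```python
def match_services(query: str, services: list[str], limit: int = 10) -> list[str]:
    """Return service names that match the typed text, case-insensitively.

    Assumption: names that start with the query rank ahead of names that
    merely contain it. The real matching rule is not described in the webinar.
    """
    needle = query.strip().lower()
    if not needle:
        # Nothing typed yet, so offer no suggestions.
        return []
    starts = [s for s in services if s.lower().startswith(needle)]
    contains = [
        s for s in services
        if needle in s.lower() and not s.lower().startswith(needle)
    ]
    return (starts + contains)[:limit]


# Hypothetical service names, standing in for a list pulled from Opsgenie.
SERVICES = ["Auth API", "Billing", "Login Service", "Authz Gateway"]
suggestions = match_services("auth", SERVICES)
```

Because the field is optional and multi-select, a caller would simply keep zero, one, or several of the returned names.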
(09:19):
And we'll go and jump into this incident channel. And you'll notice that Blameless will do its kind of standard things where we'll do things like auto inviting folks into the channel. We'll go and provide you with clear tasks and roles and responsibilities as well. But, from here, let's say that we're understanding more about the incident. The command to run an ad hoc Opsgenie alert would simply be slash Blameless trigger alert, and we'll then be presented with another modal. And this is where we can go and select the service as well. So again, as you understand more about the incident, you go and pick a service. In this case we'll just type ahead. Again, same thing, it'll give us those values there, but you also have the ability to page out to teams directly as well. So depending on how you've implemented Opsgenie, we've made it flexible enough for folks to be able to pick between one or the other, or both if you'd like. But this way you can go ahead and make sure the page is going to the right individual.
(10:24):
So from here, once you go and trigger the alert, again, kind of like a cooking show, we're fast forwarding here a little bit. You'll notice that whenever those pages do go out and I'm in Opsgenie now, one of the things that we'll do to help get folks into the right location, and in this case the Blameless Slack channel, is that if we jump into one of these alerts here, you'll notice that we'll go and provide you with both the Blameless URL for the incident itself as well as the incident Slack channel.
(10:53):
So that way, in terms of the mobilization, the kind of recruiting of the right folks, we'll provide them with those breadcrumbs to get them to the right space to make sure that we can minimize that time to respond as much as we can. And also, like I was mentioning before, as you go through the incident, we'll leave behind these different events within our incident timeline, giving you that audit trail of what you can and can't do or what you can track within the incident and ultimately leading into that learning aspect, which will be covered within the Blameless retrospective.
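To make the "breadcrumbs" idea concrete, here is a minimal sketch of an alert that carries the incident URL and Slack channel, built for Opsgenie's public Create Alert endpoint. It is not how Blameless builds its alerts internally, which the webinar does not show. The function name, the `details` key names, and the example URL and channel are assumptions; the endpoint, the `GenieKey` authorization header, and the `message`, `description`, `responders`, and `details` fields come from Opsgenie's Alert API. The sketch only builds the request and does not send it.

```python
import json

# Public Opsgenie endpoint for creating alerts.
OPSGENIE_ALERTS_URL = "https://api.opsgenie.com/v2/alerts"


def build_alert_request(api_key, title, incident_url, slack_channel, teams):
    """Build headers and a JSON body for an Opsgenie Create Alert call.

    The "incident_url" and "slack_channel" keys inside "details" are
    hypothetical names chosen for this sketch.
    """
    payload = {
        # Opsgenie caps the alert message at 130 characters.
        "message": title[:130],
        "description": f"Incident: {incident_url}\nSlack channel: {slack_channel}",
        # Page each selected team by name.
        "responders": [{"name": team, "type": "team"} for team in teams],
        # Free-form key/value pairs shown on the alert; these are the breadcrumbs.
        "details": {
            "incident_url": incident_url,
            "slack_channel": slack_channel,
        },
    }
    headers = {
        "Authorization": f"GenieKey {api_key}",
        "Content-Type": "application/json",
    }
    return headers, json.dumps(payload)


# Placeholder values for illustration only.
headers, body = build_alert_request(
    api_key="example-key",
    title="Users unable to log in",
    incident_url="https://example.invalid/incidents/123",
    slack_channel="#incident-123",
    teams=["Identity"],
)
```

Sending would be a single POST of `body` with `headers` to `OPSGENIE_ALERTS_URL`; a responder opening the alert then sees both links without having to search for the incident.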
(11:33):
So just to recap real quick, the number of ways that we can go ahead and integrate with Opsgenie is, again, if we go into Slack, we'll have a number of Slack commands that you could run to make sure that we incorporate Opsgenie into the workflow. If you want to have Opsgenie alerts automatically trigger Blameless incidents, you do have a webhook that's available as well, which you can configure within Opsgenie. And then, of course, on incident start, you can go and select a service as well. So that way if you do know which service is impacted by this incident at the very beginning, you can go and select those as you're creating the incident to make sure that the right folks are involved.
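The webhook path in this recap, where an Opsgenie alert automatically triggers an incident, can be sketched as a small translation step. Everything here is an assumption made for illustration: the incoming field names (`action`, `alert`, `alertId`, `message`, `tags`) follow the general shape of Opsgenie's outgoing webhook payload but should be checked against the Opsgenie documentation, and the returned dictionary is a hypothetical incident request, not the Blameless API.

```python
import json
from typing import Optional


def incident_request_from_webhook(body: str) -> Optional[dict]:
    """Turn an Opsgenie webhook body into a hypothetical incident request.

    Returns None for events that should not open an incident.
    """
    event = json.loads(body)
    # Only newly created alerts open an incident; acknowledgements,
    # closes, and other actions are ignored in this sketch.
    if event.get("action") != "Create":
        return None
    alert = event.get("alert", {})
    return {
        "title": alert.get("message", "Untitled alert"),
        # Keep the alert ID so the incident can be traced back to its source.
        "source_alert_id": alert.get("alertId"),
        "tags": list(alert.get("tags", [])),
    }


# Sample payload with made-up values, shaped like the assumed webhook format.
sample = json.dumps({
    "action": "Create",
    "alert": {"alertId": "abc-123", "message": "Login errors spiking", "tags": ["auth"]},
})
request = incident_request_from_webhook(sample)
```

Filtering on the action keeps follow-up events on the same alert from opening duplicate incidents.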
(12:21):
Open up the chat real quick. All right, so again, our focus here was just to highlight the exciting enhancement that we've made with Opsgenie. Again, we're happy to walk through a more in-depth demo with you one-on-one with your team if you'd like as well. But let me go back into the slides here and we'll cover a couple more and then we'll kind of open it up for Q and A. So again, just to recap, if we go back to that graph that we just looked at there, a number of different phases with the incident. The incident starts, you want to assemble folks, coordinate, communicate, learn, but really where Opsgenie will come into this workflow will be at that assembly point, right?
(13:07):
So it could be something where we can tie those two solutions together and make it so it's more integrated into your workflow today. You want to reduce the number of screens that folks are going to when it comes to dealing with incidents, keep them in a single place of collaboration, which in this case would be Slack or Microsoft Teams. And that way, Blameless will be able to help with the rest of the items that you see here in this graph.
(13:40):
Great. That was very quick, but again, if you do have any questions, we have built in some time here to open it up for Q and A. We also have a free trial that you can access as well. You can go to our website, you can also scan this QR code, but we're more than happy to set up an individual call as well, just to make sure that if you do have any kind of pain points that you're looking to improve upon right away or this quarter, this year, we're happy to kind of walk through and see how Blameless could be a potential fit for you and your team.
Aaron Lober (14:17):
Yeah, excellent. Thank you Paul. I'll double down on that point to say for anyone who's not yet utilizing Blameless, signing up through our trial flow is a great way to get hands-on with the tool really quickly. We can get you spun up really as quickly as the next hour or so, and it'll give you an opportunity to mess around with the tool, also explore what our integration with Slack or MS Teams or Opsgenie or PagerDuty, if that is more your proclivity, or some of the downstream tools like Jira and others.
(14:59):
So, it's a great way to get exposure into what we're doing. And of course, myself, Paul, our sales team, are all here poised to help guide you through that exploration. So why don't we leave this up in case anyone wants to use it, but now also open up for Q and A. If there are specific pointed questions, if you want us to dive back into the tool, I think we can do that as well. But let's start there. Also feel free to come off mute if you like, or if you prefer to drop a question in the chat, you can do that as well.
Speaker 4 (15:51):
No, I'm good. Thank you. This was great, great overview. I actually have a call scheduled with a sales rep later today so I can drill into things then.
Aaron Lober (16:02):
Fantastic. Well thank you so much, Chuck. I really appreciate it.
Speaker 4 (16:06):
Sure, yeah, thank you all very much.
Aaron Lober (16:09):
And Chisel, thank you for the correction earlier, I apologize.
Speaker 5 (16:18):
No problem. So it's a normal word, but it confuses people when they see it as a name. It's really strange.
Aaron Lober (16:26):
It's just one I hate to get wrong, but thank you for being gentle with me. Well, if there are no other questions, why don't we leave it there. I'll just again say thank you to everybody for logging in. I hope this was valuable. And I'll reiterate one more time, if you're not currently using the tool and you want to get signed up for a trial, you can use this access pathway. Or if you prefer to have a conversation with a sales rep, we've got a link to a demo request in the chat as well. So with that, Paul, Nicolas, Sarah, thank you guys and everybody have a great day.
Speaker 5 (17:10):
Thank you.