Creating actionable alerts is a continuous process that can enhance your workflows so that the correct people are notified at the right time and can take immediate action to reduce potential business impact. This post is the first in a three-part series about alert enrichment. Without actionable alerts, your responders may be notified of an issue but unable to act on it right away, which can increase downtime and slow remediation. Actionable alerts set your responders up for success from the start of an incident and empower them to begin repairing damaged services immediately. There are many ways to create actionable alerts, so as an introduction to our newest White Paper, Creating Actionable Alerts to Maximize Resolution Speed, we want to share our first method.
Being on-call can be a daunting and disruptive experience. Many people with on-call duties complain that having to be ready to handle incidents affects their work-life balance, and even their health, as on-call employees may be frequently woken up in the middle of the night or may need to plan evenings and weekends around on-call duties. As organizations roll out changes to scale on-call teams, they need to consider how best to match that evolution with a sustainable and humane solution. Below is some advice based on our experiences at OpsGenie so far with our customers.
For more information on the subject, download our recent White Paper: Scaling On-Call in a DevOps Organization.
Follow-the-sun schedules are a way for your company to offer 24/7 global customer support while preventing on-call burnout for your engineering and customer support teams. Having someone on-call at all times, across different time zones, means that no one team has to wake up in the middle of the night to deal with an alert or customer issue. True to its name, the schedule ideally follows the sun: the configuration usually consists of three rotations that are staggered to cover three 8-hour shifts. However, there are multiple ways to configure a follow-the-sun schedule using OpsGenie schedules.
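To make the three-rotation idea concrete, here is a minimal sketch in Python of how three staggered 8-hour shifts can cover a full day. It is only an illustration of the concept, not OpsGenie's schedule configuration or API: the rotation names, the UTC start hours, and the `rotation_on_call` helper are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

SHIFT_HOURS = 8

# Hypothetical rotations: (name, shift start hour in UTC).
# Each rotation covers SHIFT_HOURS hours, so three of them span 24 hours.
ROTATIONS = [
    ("APAC", 0),   # 00:00-08:00 UTC
    ("EMEA", 8),   # 08:00-16:00 UTC
    ("AMER", 16),  # 16:00-24:00 UTC
]


def rotation_on_call(moment: datetime) -> str:
    """Return the name of the rotation covering a timezone-aware moment."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    # Normalize to UTC so the lookup is independent of the caller's zone.
    hour = moment.astimezone(timezone.utc).hour
    for name, start in ROTATIONS:
        if start <= hour < start + SHIFT_HOURS:
            return name
    raise ValueError("rotations do not cover the full day")


# Usage: 09:00 in a UTC+9 zone is 00:00 UTC, which falls in the first shift.
tokyo_morning = datetime(2024, 1, 1, 9, 0, tzinfo=timezone(timedelta(hours=9)))
print(rotation_on_call(tokyo_morning))
```

Because every rotation works during its own daytime hours, the handoff happens at shift boundaries rather than in the middle of anyone's night. A real schedule would also need to handle weekends, holidays, and overrides, which this sketch leaves out.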
We love Slack like you do because it is where we get things done at work. Slack applications are the gateway for our favorite tools like Intercom, Jira, Google Drive and many more. There are also ChatOps tools like OpsGenie’s Slack application that focus on improving collaboration and automation by bringing day-to-day operational challenges into shared chat channels.
DevOps is not just about developers and operations people working together or creating a culture of collaboration. It is about tightening the feedback flow. It is about working for the common good of your systems and applications. It is about learning from mistakes. These are all enabled by people and the tools that people use. Continuous delivery is a key enabler for DevOps because it helps you deploy and release your code with confidence.
StatusCake is a website uptime and performance monitoring solution. You can gain invaluable insights into your website's performance and get alerted when things aren’t right. StatusCake is capable of sending email or SMS notifications. These are great.
In 1970, a series of devastating wildfires swept across California, destroying more than 700 homes over 775 square miles in 13 days with 13 fatalities, and resulting in more than $233 million in losses (over $1 billion in today’s dollars, adjusted for inflation). Thousands of firefighters from around the state and beyond responded, but found it very difficult to work together. They certainly knew how to fight fires, but lacked a common management framework that could scale up and down with the incident. They also lacked a standardized approach for incident leadership. Shortly thereafter, several fire service leaders created a revolutionary system for managing emergencies that range from the everyday fire and medical emergency to large-scale emergency events that make the national news. The Incident Command System (ICS) was born, which has since evolved into the Incident Management System (IMS).
Technology Solutions Providers all around the world rely on the ConnectWise technology stack to manage their business, sell more efficiently, automate service delivery, and remotely control technology to deliver amazing customer experiences.
OpsGenie provides unique incident response orchestration capabilities that complement ConnectWise products to prepare for incidents that have the potential to impact business. With OpsGenie’s ConnectWise certified app, you have all the data you need to analyze and resolve problems as well as the tools to develop incident response plans, collaborate and coordinate the response actions, and analyze response effectiveness.
ServiceNow is a service management platform that allows users to submit requests for technical support for hardware, software, applications, and more. Organizations around the world leverage its capabilities to consolidate systems and automate service management processes.
OpsGenie provides unique incident response orchestration and alert management capabilities that complement ServiceNow, enabling teams to prepare for and address incidents that have the potential to impact business. With OpsGenie’s ServiceNow certified app, users have all the data and tools to design actionable alerts and incidents, manage on-call schedules and escalations, and orchestrate communication and collaboration during the incident resolution process.