Threat Detection and Incident Response Orchestration Systems

Feb 23, 2017 by Karine Margaryan

Recently, OpsGenie hosted a joint webinar with Signal Sciences, in which Berkay Mollamustafaoglu, OpsGenie Co-Founder and CEO, and Zane Lackey, Founder/CSO of Signal Sciences Corp, discussed how to secure your web applications using OpsGenie and Signal Sciences.

Watch the webinar now!

Read more »

Notify Thousands With OpsGenie!

Feb 16, 2017 by Emel Dogrusoz

We at OpsGenie continuously work hard to add new capabilities to our product.  We are proud to announce the BETA release of a new resource that will help alleviate daily stress: Mass Notifications!

Read more »

Using Logs In Incident Response

Feb 8, 2017 by Karine Margaryan

A few months back, we held a webinar in conjunction with Logentries in which we demonstrated “2 Use Cases for Using Logs in Incident Management.” In the webinar, we examined integrations between traditional log management and IT alerting tools, along with the future of incident management with enriched alerts. If you’re already an OpsGenie or Logentries user, or you use monitoring and collaboration systems like New Relic or Slack, this blog post might be of interest to you. You will learn about improving your alert notification processes and incident response times.

Read more »

OpsGenie News - February 2017

Feb 1, 2017 by OpsGenie Team

CA Flowdock “Team Inbox” Integration

  • OpsGenie has released a new version of the CA Flowdock Integration. You can now forward OpsGenie alert activities to the CA Flowdock "Team Inbox" using the Flowdock (Team Inbox) Integration, and to the Flowdock "Chat Window" using the Flowdock (Chat) Integration.
  • Please refer to Flowdock (Team Inbox) Integration for more details.

Heartbeat API v2 Released

  • OpsGenie has released Heartbeat API v2, a new implementation designed around RESTful principles.
  • Heartbeat API v1 will be deprecated by the end of June 2017.
  • The Heartbeat API v2 documentation is available here.

Mattermost Integration

  • OpsGenie now has bi-directional integration with Mattermost!
  • Mattermost is an open source, private cloud Slack-alternative platform.
  • OpsGenie is an alert and notification management solution that is highly complementary to Mattermost.
  • With this integration, OpsGenie's alert notifications go right into your Mattermost channel. In turn, you can execute actions such as acknowledging, closing or assigning alerts from the channel by just typing a /genie command.
  • For more information, please visit the Mattermost Integration document.

SolarWinds Web Help Desk Integration

  • SolarWinds Web Help Desk automates IT help desk and asset management operations. It also centralizes IT asset discovery and management.
  • You can easily integrate SolarWinds Web Help Desk with your OpsGenie account and get notifications from OpsGenie when a ticket is created on SolarWinds Web Help Desk.
  • For more information, please visit the SolarWinds Web Help Desk Integration document.


Read more »

AWS CodeCommit to Manage OpsGenie Configuration

Jan 30, 2017 by Ibrahim Guntas

You spend time creating your OpsGenie configuration, so it’s now crucial to ensure that your account configuration is safe and reproducible just in case something goes wrong.

Git is a distributed version control system that records changes to a file or set of files over time. This allows tracking changes to each file and recalling a specific version later. It’s primarily used by developers as a source code management system; however, it can and should be utilized in Operations, too. Git is efficient not just for code, but also for other kinds of files, such as configuration files.

Earlier, OpsGenie introduced the new “OpsGenie Configuration Management Tool.” This tool uses the OpsGenie Java SDK to export OpsGenie configurations as JSON files, enabling customers to take advantage of Git to manage the configuration data. Got that? The tool also supports restoring configurations, which means that it not only tracks configuration changes, but can also revert to a previous configuration if needed.
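
To illustrate why JSON exports pair well with Git, here is a minimal Python sketch. It does not use the OpsGenie Java SDK or the Configuration Management Tool itself; the function names `render_config` and `export_configs` and the sample data are hypothetical. The idea is simply that writing each configuration object with sorted keys and fixed indentation yields stable files, so a Git diff shows only real changes.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def render_config(config: Dict[str, Any]) -> str:
    """Serialize one configuration object as stable, diff-friendly JSON."""
    # sort_keys keeps the output identical regardless of dict ordering.
    return json.dumps(config, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


def export_configs(configs: Dict[str, Dict[str, Any]], target_dir: Path) -> List[Path]:
    """Write each named configuration to <target_dir>/<name>.json (hypothetical layout)."""
    target_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name in sorted(configs):
        path = target_dir / f"{name}.json"
        path.write_text(render_config(configs[name]), encoding="utf-8")
        written.append(path)
    return written
```

Once the files are in a Git working tree, the usual `git add` and `git commit` record each export, and checking out an earlier commit recovers a previous configuration to restore from.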


Read more »

OpsGenie Triples Revenue in 2016, Adds Industry Leaders as Advisors

Jan 25, 2017 by OpsGenie Team

New Customers Yahoo, HubSpot, Overstock Sign On and Fuel Company’s Momentum; AppDynamics Chairman Jyoti Bansal Joins as an Advisor, and Google Executive Izzy Azeri Joins the Board.

Falls Church, VA – January 25, 2017 – OpsGenie, a leading alerting and on-call management platform for engineering and operations teams, today announced record growth for 2016 and new strategic advisors for the company.

Read more »

Wondering how to trigger alerts from Slack messages?

Jan 24, 2017 by Serhat Can

Slack is great!

It is not just a messaging app; it is so much more with its apps. OpsGenie’s integration with Slack is so thorough that it has become a top app in the Slack app directory!

OpsGenie's Slack app has built-in slash commands, which is an efficient way to execute actions on a channel. For example, you have the option to create an OpsGenie alert in a channel by typing a command like /genie alert [alert message] for [user team]. Slash commands are very powerful and cover the majority of the use-cases. However, there may be more you want to do in your Slack channel.
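
As a rough illustration of the command shape quoted above, the following Python sketch splits text of the form `/genie alert [alert message] for [user team]` into its two parts. It is not how the OpsGenie Slack app parses commands; the pattern, the `AlertCommand` type, and the `parse_alert_command` function are assumptions made for this example.

```python
import re
from typing import NamedTuple, Optional


class AlertCommand(NamedTuple):
    message: str
    recipient: str


# Hypothetical pattern for "/genie alert <message> for <recipient>".
# The message group is greedy, so the LAST " for " separates message and recipient.
_PATTERN = re.compile(r"^/genie\s+alert\s+(?P<message>.+)\s+for\s+(?P<recipient>\S.*)$")


def parse_alert_command(text: str) -> Optional[AlertCommand]:
    """Return the parsed command, or None if the text is not an alert command."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    return AlertCommand(match["message"].strip(), match["recipient"].strip())
```

For example, `parse_alert_command("/genie alert disk is full for ops_team")` yields a message of `disk is full` and a recipient of `ops_team`, while any other text returns `None`.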

Read more »

Power Your Team with Atlassian and OpsGenie!

Jan 19, 2017 by Emel Dogrusoz

As opposed to the traditional software business, the modern-day software business is platform-based, which empowers integration with a widening world of products and services and thus enables new business ecosystems. The resulting ecosystems enable results and achievements that are far more effective and capable than individual product offerings. In this new software world, companies co-evolve their capabilities and cooperate to support consumer needs at every stage, eventually conceiving the next big innovations in technology.

Read more »

SolarWinds Partners with OpsGenie to Meet Customers’ Alerting and On-Call Management Needs

Jan 13, 2017 by OpsGenie Team

The partnership comes as SolarWinds discontinues SolarWinds Alert Central and recommends OpsGenie to its customers as an effective alternative

AUSTIN, Texas – January 4, 2017 – SolarWinds, a leading provider of powerful and affordable IT management software, today announced a strategic partnership with OpsGenie, a leading alerting and on-call management solution for development and operations teams. The partnership comes as SolarWinds discontinues its free alert aggregation tool, SolarWinds® Alert Central.

Read more »

SolarWinds and OpsGenie Partnership

Jan 13, 2017 by Karine Margaryan

Last week we were proud to announce a strategic partnership with SolarWinds! As SolarWinds announced the End-of-Life of their Alert Central (AC) product, they found OpsGenie to be the best alerting and On-Call (incident) management platform on the market to recommend to their customers. OpsGenie was enthusiastically ready to make this transition as smooth as possible.

Read more »

Monitoring AWS Lambda Functions with OpsGenie

Jan 9, 2017 by Serhat Can

Monitoring is an important part of DevOps. You need to monitor your servers, containers, and now your serverless functions. Function as a Service (FaaS), also called “Serverless” architecture, is a relatively new concept. AWS Lambda is by far the most popular FaaS/Serverless solution in the market. It is an event-driven, serverless computing service.

Read more »

OpsGenie News - January 2017

Jan 1, 2017 by OpsGenie Team

Custom User Role Updates

  • Users who are not an admin or owner can now be granted the right to update active or planned Maintenance Policies.
  • You can refer to Users and Roles for further information about custom user roles.

ConnectWise Automate (LabTech) Integration

  • ConnectWise Automate, formerly LabTech, is the industry-leading IT automation software designed to allow you to automate any IT task to improve your IT services. ConnectWise Automate allows you to discover all devices and users so they can be proactively monitored.
  • You can easily integrate ConnectWise Automate with your OpsGenie account and get notifications from OpsGenie when an alert is created on ConnectWise Automate.
  • For more information, please visit the ConnectWise Automate Integration document.

SolarWinds and OpsGenie Partnership to Meet Customers’ Alerting and On-Call Management Needs

The partnership comes as SolarWinds discontinues SolarWinds Alert Central and recommends OpsGenie to its customers as an effective alternative

Read more »

OpsGenie is on-call with you over the holidays.

Dec 27, 2016 by Karine Margaryan

1. Manage your alerts on the go!

Are email notifications enough to control your IT infrastructure's behavior? No, not anymore. Email notifications are no longer as effective when there are other channels of communication available, such as phone calls, SMS, or iOS & Android push notifications.

Read more »

6 New Year’s Resolutions for Incident Management

Dec 21, 2016 by Emel Dogrusoz

It is that time of the year again! New hopes and motivations to improve our lives for the better! Yes, we are talking about New Year’s resolutions. :) Aside from your personal resolutions, we would like to focus on the positive changes you can make to your professional life.

Read more »

Incident Resolution via Virtual War Rooms

Dec 16, 2016 by Emel Dogrusoz

As incident arbitrators, we all know how complicated and stressful it is to work on time-critical incidents. Every extra minute your team spends on resolving an incident is valuable and may have a devastating impact on your business and customers. So what can you do to minimize the time, effort, and stress related to major incidents?

Read more »

AWS re:Invent 2016 Customer Appreciation Event

Dec 13, 2016 by Berkay Mollamustafaoğlu

Hello, Readers!

Berkay here! Early this week, OpsGenie had the opportunity to host a very exciting Customer Appreciation event. During the week of the AWS re:Invent conference, we opted to foster relationships outside the conference. The OpsGenie team flew into Las Vegas on Monday night, and boy are our arms tired. Ok. Bad joke. :)

Anyways, the next morning we made the most of our day by beginning it with a company breakfast at the MGM Grand Hotel, where we discussed business and logistics for the event. But first, we took a walk along the strip for some fresh air and adventure. It was a beautiful day in Vegas!

Read more »

Developing Custom Solutions Using OpsGenie APIs

Dec 9, 2016 by Celal Emre Çiçek

At OpsGenie, we do our best to enhance our customers' experience using our platform. We consider their needs and develop features that will promote efficiency and growth.

We believe that our product is already feature-rich, yet we’re continuing to develop it further to promote customization and flexibility. Our team works daily on product enhancements, and they’re always ready to extend the product with customizations as you need.

Read more »

New OpsGenie Configuration Management Tool

Nov 23, 2016 by Karine Margaryan

Backup is key in information technology. Data loss may be irreversible causing huge inconveniences to people and companies. When you have a large number of administrators who constantly make changes to the system, the risk of making incorrect actions is high.

Have you ever lost data with an accidental click? Was your hardware corrupted or your PC infected by viruses and cyber attacks, which led to data loss? Research indicates that almost 50% of people do not back up their data and then try in vain to recover it.

Read more »

Build Your Teams and Fight Alert Fatigue!

Nov 4, 2016 by Emel Dogrusoz

In today’s world, most organizations use a team-based structure. With this, organizations strive to define responsibilities, build the right skill sets, distribute workload, and eventually maximize productivity and success.

At OpsGenie, we care about our customers’ flexibility to adapt our software to their organizational needs.

Read more »

From our ChatOps to yours

Sep 28, 2016 by Serhat Can

At OpsGenie, we do our best to be punctual and early adopters of new features released by the companies we integrate with. We want to stay flexible while providing a complete workflow and feature set.

Not so long ago, Slack introduced “Interactive Buttons,” which OpsGenie had a great tested use case for! So, we started working on it. As an early adopter, our previous Slack Application (App) was already part of the directory, which led us to build a brand new Slack App.

Read more »

Better ChatOps With the OpsGenie Add-on For HipChat

Sep 22, 2016 by Jonard Doci

We at OpsGenie work to make integrating our product into the user’s existing toolset as smooth as possible. That may seem like an unmanageable task, since each customer has a unique model for conducting their daily operations.

ChatOps: Productive Chatting

A particular model of operation that’s on the rise is ChatOps. This model revolves around team communication and aims to unify all of the user’s tools, workflows, and processes into a single chatroom. An example might include a bot that pulls your code from GitHub and deploys it straight into one of your Amazon EC2 machines. All you need to do is type the command that triggers the bot to follow the action and voilà! Everyone on your team will also be able to follow the procedure live from the chatroom. Among other benefits, this ChatOps unified model makes operations more transparent and increases team productivity.

Read more »

Smarter Incident Workflow with New Alert & Notification Policies

Aug 22, 2016 by Kadir Türker Gülsoy

For us incident responders and managers, incident management is a complicated beast that requires an active effort to streamline an effective workflow for identifying, analyzing, and solving incidents. Failing to notice a problem is intolerable for us; on the other hand, we don't want so many alerts and notifications that they cause alert fatigue, which may lead to longer response times or missed significant incidents. To prevent these two crucial needs from becoming a dilemma, we are excited to introduce a set of new features: Auto Restart and Alert Count Based Notification Policies.

Read more »

Alerts: Gotta Catch 'em All

Jul 14, 2016 by Serhat Can

If you haven’t noticed yet, Nintendo's new game, Pokémon GO, is literally taking over the world. Pokémon GO reinvents a classic game with an augmented reality twist. The game is not available worldwide yet; however, that has not prevented users from downloading and catching 'em all. According to a Forbes article by Jason Evangelho, Pokémon GO is about to surpass Twitter in daily active users on Android and is #1 on Google Play Store. Incredible success!

Read more »

On-Call Toolkit

Jul 12, 2016 by David Hiltner

What tools do you need to be an on-call warrior or hero? In this blog post we’ll examine the instruments needed to be on-call. Our goal is to provide you with on-call gear and tool ideas that help make your on-call experience successful, whether you’re an on-call novice or a pro. We welcome any additional tips from our readers by responding to this post through our social media channels: LinkedIn, Twitter, Facebook, or Google+.

Read more »

OpsGenie Raises $10 MM Financing Led by Battery Ventures

Jun 30, 2016 by OpsGenie

Falls Church, VA—June 29, 2016—OpsGenie, an emerging player in the critical area of IT alerting and on-call management, has raised $10 million in Series A financing from Battery Ventures, a global investment firm. OpsGenie will use the funds to continue to tackle the biggest challenges faced by customers in providing “always-on” services, and specifically to continue investing in its product and building out its go-to-market capabilities. As part of the financing, Battery General Partner Neeraj Agrawal and Battery Vice President Paul Drews will join OpsGenie’s board.

Read more »

Troubleshooting problems from a chat room with Slack and OpsGenie

Jun 17, 2016 by Berkay Mollamustafaoğlu

With OpsGenie you have the option of executing actions directly through our OpsGenie app. We have described how this capability can be used to gather additional information and enable alert recipients to assess problems efficiently in the post “You woke me up, now what?”

Directly through Slack you can (1) forward alert activity to Slack channels and (2) allow users to interact with alerts: acknowledge, comment, close, etc. Refer to our blog post titled “Bi-Directional Integration with Slack.”

Read more »

Email Integration: Alerting and Incident Management Solutions

Apr 20, 2016 by Berkay Mollamustafaoğlu

In the coming weeks, OpsGenie will help buyers looking for a reliable, scalable, and customizable alerting and incident management solution by assessing features, toolsets, and functionality in a comprehensive comparison between OpsGenie, PagerDuty, and VictorOps through a series of detailed blog posts. It is our goal to shed light on who does what, and on the stark differences between the three popular technologies. OpsGenie will concentrate on areas within our platform that we believe are extremely important when looking for an alerting and incident management solution for the Dev & Ops and IT community in general. This week we will focus on Email Integration.

PART 1: Email Integration -- A comparative assessment of the direct contrasts between OpsGenie, PagerDuty, and VictorOps. OpsGenie’s email integration enables customers to integrate OpsGenie with any system that can send alerts via email. Email integration is the most commonly used integration method by our customers since it is easy to use and almost any system out there can send emails.

Read more »

With on-call alerts, sometimes it can wait..

Apr 5, 2016 by Berkay Mollamustafaoğlu

For most of us in ops, it is vital to get notified asap about problems that impact the services we provide. It’s often a race against time to restore the service or to prevent an outage. But not all alerts require an immediate response; some can wait. Enabling users to deal efficiently with alerts that don’t require an immediate response is just as important in preventing alert fatigue and ensuring we can stay fresh.

At OpsGenie our mission is to empower our users to be able to handle critical as well as non critical/urgent incidents efficiently.

Read more »

Which superhero would be best at answering on-call alerts?

Mar 25, 2016 by OpsGenie Team

The OpsGenie team recently had a thorough and heated discussion (KAPOW!!) on who would be better with on-call alerts and incident management, Superman or Batman? Who would come out as winner when pitted against each other in a war of on-call alerts and response time? So, we thought we would hash it out here on our blog in a completely fictional format. We’ll try to examine each area of alerting and incident management to see who we think we would want on our on-call team. 

Read more »

What’s great about the NEW OpsGenie Apple Watch App?

Mar 11, 2016 by Nadia Mehra

As if you needed another reason to love OpsGenie and all its capabilities: we released an OpsGenie app for the hi-tech, sleek Apple Watch, so now you can get the most out of your weekends and look stylish doing it.

Read more »

Fighting alert fatigue - mute and bulk actions

Mar 3, 2016 by Berkay Mollamustafaoğlu

Continuing with the discussion on how OpsGenie can help alleviate alert fatigue, we will be examining areas where on-call employees take specific bulk actions to reduce the excessive alerts that often hinder operations.

Read more »

Fighting Alert Fatigue - Alert Deduplication -- Part 1

Feb 25, 2016 by Berkay Mollamustafaoğlu

The concept of “Alert Fatigue” is well known in industries such as healthcare, and awareness is increasing in IT operations as well. Fighting alert fatigue has been a key design objective for OpsGenie since our inception. An earlier post summarized some of the key capabilities that OpsGenie provides to alleviate alert fatigue. In this two-part series, I go into more detail on how these features can improve the alert signal-to-noise ratio.
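
To make the deduplication idea concrete, here is a minimal Python sketch of one way it can work. It is an illustration under assumptions, not OpsGenie's implementation: the `Alert` type, the `alias` field used as the deduplication key, and the `deduplicate` function are all hypothetical. Repeated events with the same key raise a counter on the existing alert instead of creating a new one.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Alert:
    alias: str      # hypothetical deduplication key
    message: str
    count: int = 1  # how many times this alert has been raised


def deduplicate(events: Iterable[Alert]) -> List[Alert]:
    """Collapse events that share an alias into one alert with a running count."""
    open_alerts: Dict[str, Alert] = {}
    for event in events:
        existing = open_alerts.get(event.alias)
        if existing is None:
            # First occurrence: open a new alert.
            open_alerts[event.alias] = Alert(event.alias, event.message)
        else:
            # Repeat occurrence: count it without opening another alert.
            existing.count += 1
    return list(open_alerts.values())
```

With this approach, three incoming events for two distinct problems produce two alerts rather than three, which is the kind of signal-to-noise improvement the series discusses.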

Read more »

Routing phone calls using on-call schedules - OpsGenie

Feb 8, 2016 by Berkay Mollamustafaoğlu

Since we launched the OpsGenie phone call routing feature last year, we’ve had an enormously positive response from customers. So much, in fact, that we’re dusting off this blog post from last year and updating it for everyone who is not as familiar with it. Is it easy to use? Yes, it is! You see, OpsGenie routes alerts to the appropriate on-call individual using policies, on-call schedules, etc. Prior to the launch of the feature last year, we heard similar questions from a number of our OpsGenie customers, such as “Can we route phone calls to the right person like we route the alerts?” This turned out to be a great question, one that resonated with many of our customers. For a product team, customer feedback like this is priceless!

Read more »

You woke me up. Now what?

Jan 29, 2016 by Berkay Mollamustafaoğlu

As an alert notification solution, our first priority is to ensure that the right person is notified when there is a problem. OpsGenie sends multiple notifications through different channels, escalates etc. to ensure that critical alerts don’t get missed. As crucial as that is, if an alert notification system just stops at “waking you up”, it becomes part of the problem rather than a solution.

Read more »

How to create a free status page using OpsGenie

Jan 4, 2016 by OpsGenie Team

Every service provider wants their services to be available 24x7x365. But outages and planned maintenance are inevitable occurrences for online software services. Dealing with outages and communicating with users during the outage is as important as the availability of the services provided. To keep users informed, many service providers use web based “status pages” that contain up to date information about the health of the services, incidents, and what the provider is doing to resolve the issues.

OpsGenie is an incident management system for Dev & Ops teams. Customers use OpsGenie to consolidate the alerts generated by their monitoring systems and route them to the right people using on-call schedules and escalations. Because OpsGenie is an essential tool used during outages and we have vital information about the incidents, our customers have been inquiring whether we can create “status pages” programmatically based on the alerts generated in OpsGenie.

Responding to this request, we’ve taken up the challenge of providing a solution to manage status pages for OpsGenie customers.

Read more »

Being in the Driving Seat for Web Applications

Dec 18, 2015 by Kadir Türker Gülsoy

As long as our applications are in production, boosting uptime and avoiding outages is the highest priority for us developers and operational teams. Despite great care, having 100% uptime and avoiding outages is a challenging task for even the most stringent DevOps teams. Let’s imagine that one of your data centers stops responding and in turn your email service is completely out, or your payment service has gone offline during Black Friday. Remember the AWS outage that lasted four days and affected countless cloud services in April 2011. This is a good example that outages happen even to the most secure environments. Now what? Are you going to examine huge log files to find out what went wrong? Are you going to notify all of your operational teams and developers at the same time to investigate the cause? Unless you allocate large resources for chaos engineering like Netflix does, you most likely will have very limited time to overcome the issue. So those aren’t realistic options for most organizations.

Read more »

SaaS integrations put traditional enterprise software to shame

Dec 3, 2015 by Berkay Mollamustafaoğlu

I’ve spent many years implementing traditional enterprise IT operations management tools. Integrations among various tools are often the Achilles’ heel of management systems. Integrating various applications is often a high-risk endeavor for customers. Enterprise vendors typically charge tens of thousands of dollars for integration “plugins”, and the implementation requires highly skilled (and expensive) engineers. To make matters worse, enterprise vendors are often not keen on collaborating with their competitors, let alone collaborating to help their customers. Vendors sometimes even block these integration efforts. I’ve witnessed a vendor refusing to sell their product in order to prevent others from integrating with it (how is that for putting the customer first?).

Read more »

A real-time dashboard for OpsGenie using AWS Lambda and PubNub

Oct 26, 2015 by Zafer Gençkaya

OpsGenie Webhook integration provides great flexibility to build solutions for specific requirements. In this blog post, we'll build a real-time dashboard for OpsGenie alerts. This dashboard will provide a quick overview of the most recent open alerts in OpsGenie, and when there are new activities on alerts, the dashboard will reflect these changes immediately.

The solution leverages AWS API Gateway and Lambda services as a serverless backend and PubNub for the real-time data stream. Both AWS and PubNub offer free tiers.

Read more »

Using Logstash to correlate events

Oct 13, 2015 by OpsGenie Team

The OpsGenie integration family has many members and is still growing. The objective of this blog post is to explain how to use one of the newly added ones, Logstash. Logstash is a data pipeline that helps you process logs and other event data from a variety of systems.

A Logstash pipeline in most use cases has one or more input, filter, and output plugins. Logstash has a rich collection of input, filter, codec and output plugins. A filter plugin performs intermediary processing on an event. Filters are often applied conditionally depending on the characteristics of the event. An output plugin sends event data to a particular destination. Outputs are the final stage in the event pipeline.
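
The input, filter, and output stages described above can be modeled in a few lines of Python. This is a conceptual sketch only: it is not Logstash configuration syntax and uses no Logstash API, and the names `run_pipeline`, `tag_errors`, and `drop_debug`, as well as the `level` and `tag` fields, are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional

Event = Dict[str, str]
Filter = Callable[[Event], Optional[Event]]


def tag_errors(event: Event) -> Optional[Event]:
    """Conditional filter: only events with level ERROR are modified."""
    if event.get("level") == "ERROR":
        return {**event, "tag": "needs_attention"}
    return event


def drop_debug(event: Event) -> Optional[Event]:
    """Returning None removes the event from the pipeline."""
    return None if event.get("level") == "DEBUG" else event


def run_pipeline(inputs: Iterable[Event], filters: List[Filter],
                 output: Callable[[Event], None]) -> None:
    """Pass each input event through the filters in order, then to the output."""
    for event in inputs:
        current: Optional[Event] = event
        for apply_filter in filters:
            current = apply_filter(current)
            if current is None:
                break  # event was dropped by a filter
        if current is not None:
            output(current)  # final stage of the pipeline
```

Here the output is any callable, so appending to a list stands in for sending the event to a destination such as an alerting service.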

Read more »

Integrating OpsGenie and JIRA using AWS Lambda

Oct 7, 2015 by Berkay Mollamustafaoğlu

In the previous blog post, we went through how to create JIRA issues from OpsGenie alerts using the open source Marid utility. The Marid approach is particularly useful when integrating OpsGenie with an on-premise, self-hosted JIRA instance, since Marid initiates the connection and does not require opening up the network.

Read more »

Creating Issues from Alerts in JIRA

Sep 4, 2015 by Berkay Mollamustafaoğlu

Whenever possible, we implement direct integration between OpsGenie and IT management services used by our customers. For example, we have direct inbound integration with JIRA that enables OpsGenie customers to create alerts and notify users for JIRA issues.

Read more »

Mobile App Performance Management|New Relic & Crashlytics

Mar 24, 2015 by Kadir Türker Gülsoy

Mobile development requires hard effort to meet the expectations and needs of customers. OpsGenie Mobile Apps provide a user-friendly UI in parallel with a good user experience design; however, mobile development needs much more! Mobile apps should be fast, stable, and memory-friendly besides providing a user-friendly UI. Therefore, we monitor the OpsGenie Mobile apps continuously with the help of New Relic and Crashlytics to be able to improve our apps continuously (and of course applicatively).

Read more »

New OpsGenie iOS Mobile App

Mar 20, 2015 by Kadir Türker Gülsoy

Mobile applications are the first response tool for incident management most of the time; therefore, being fast, stable, and usable is essential for a mobile app. We have redesigned our iOS application from scratch as a native iOS app that takes advantage of all the new capabilities added by Apple recently. After being beta tested thoroughly for over 2 months (thanks!), the iOS app is now available in the App Store!

Read more »

Monitoring scripts and cron jobs using OpsGenie heartbeat messages

Dec 26, 2014 by Halit Okumuş

Like most organizations, we have a number of jobs that run periodically in the background, as well as jobs (scripts) that get triggered under certain conditions to automate various tasks, from moving files around and backing up data to generating reports. It can be rather difficult to monitor these jobs with regular monitoring tools and get notified when cron jobs or scheduled tasks silently fail or don't complete on time.

OpsGenie Heartbeat monitoring can be used to monitor periodic jobs as well as ad-hoc or irregularly executed scripts. By sending a heartbeat message at the beginning and one at the end of the script, you can generate an alert if the script fails or does not complete within the expected time frame. The OpsGenie heartbeat monitoring API supports defining, enabling, and disabling heartbeat configurations as well as sending heartbeat messages, so it can be used from any programming language that can make HTTPS requests. In addition, we provide a Go-based executable that makes use of these functionalities for you. You don't even need to write code; all you need to do is download the executable and you'll be able to effectively monitor your batch jobs.
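
The start-and-end heartbeat idea reduces to a small decision rule, sketched below in Python. This is only the checking logic under assumed inputs, not the OpsGenie Heartbeat API or its executable: the function `job_needs_alert`, its parameters, and the use of plain seconds for timestamps are hypothetical.

```python
from typing import Optional


def job_needs_alert(started_at: float, finished_at: Optional[float],
                    now: float, max_runtime: float) -> bool:
    """Decide whether a job that sent a start heartbeat should raise an alert.

    All times are in seconds. finished_at is None while no end heartbeat
    has been received (the job is still running, or it failed silently).
    """
    if finished_at is None:
        # No end heartbeat yet: alert once the allowed runtime has passed.
        return now - started_at > max_runtime
    # End heartbeat received: alert only if the job overran its time frame.
    return finished_at - started_at > max_runtime
```

For a job allowed 60 seconds, a missing end heartbeat 100 seconds after the start triggers an alert, while an end heartbeat after 30 seconds does not.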

Read more »

Alerting for application errors using webhooks

Oct 23, 2014 by Berkay Mollamustafaoğlu

We've integrated OpsGenie with dozens of monitoring services using webhooks. A structured, secure web request is far better than email as an integration mechanism. Almost every monitoring SaaS provider offers webhooks as an alerting mechanism. But there is no reason alerting using webhooks should be restricted to monitoring/management services. It can and should be used by any SaaS application as a way to integrate with other systems used by their customers.

Read more »

Monitor your monitoring tools using heartbeats

Sep 24, 2014 by Berkay Mollamustafaoğlu

Monitoring tools are complex applications themselves, with multiple moving parts. Given that we rely on them to detect and alert us when there is a problem in our critical applications, it is essential to ensure that our tools are working as expected and can notify us when there is a problem. OpsGenie heartbeat monitoring offers a simple method to do just that:

Read more »

Designing alerts that help not just annoy

May 28, 2014 by Berkay Mollamustafaoğlu

"Empowering the alert recipients" has been a core principle for our product development since the beginning, driving many of the capabilities that differentiate OpsGenie from the alternatives. We believe that the role of the alerting system does not end with an alert notification that is devoid of any useful information. Sure, we need to make sure the right person is notified when there is a problem, but we cannot declare "mission accomplished" just because we’ve told someone that there is an alert. We believe that if we’re to interrupt someone, at the very least asking for their attention and at worst waking them up, we ought to provide the relevant information that would enable them to assess the severity and the urgency of the problem as well.

Read more »

How do you manage alerts during code deploys?

Apr 21, 2014 by Berkay Mollamustafaoğlu

As organizations embrace DevOps and Continuous Deployment, it’s becoming common to do frequent code deploys, often multiple times a day. Deployments inevitably cause monitoring tools to generate alerts as applications & servers become temporarily unavailable/unresponsive. This can be problematic since these alerts:

  • generate noise, and unnecessarily interrupt people
  • mislead people and cause them to waste time chasing down nonexistent problems
  • erode attention and cause people to miss real problems

Read more »

Troubleshooting problems from a chat room with Slack and OpsGenie

Apr 15, 2014 by Berkay Mollamustafaoğlu

OpsGenie provides the ability to execute actions directly from OpsGenie apps. In the post “you woke me up. now what?”, I’ve described how this capability can be used to gather additional information and enable alert recipients to assess problems efficiently.

Read more »

Routing phone calls using on-call schedules

Apr 7, 2014 by Berkay Mollamustafaoğlu

OpsGenie routes alerts to the right person using policies, on-call schedules, etc. defined by users. Over the last year, we’ve heard similar questions from a number of OpsGenie customers: “Can we route phone calls to the right person like we route the alerts?”
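
As a simplified picture of schedule-based routing, the Python sketch below picks who is on call at a given moment from a fixed rotation. It is an illustration under assumptions rather than OpsGenie's scheduling model: the function `on_call_person`, the equal-length shifts, and the sample names are hypothetical. The same lookup could decide where to send an alert or forward a phone call.

```python
from datetime import datetime, timedelta
from typing import List


def on_call_person(rotation: List[str], rotation_start: datetime,
                   shift_length: timedelta, at: datetime) -> str:
    """Return who is on call at time `at` in a repeating, equal-length rotation."""
    if not rotation:
        raise ValueError("rotation must contain at least one person")
    if at < rotation_start:
        raise ValueError("time is before the rotation started")
    # Whole shifts elapsed since the rotation began (timedelta // timedelta is an int).
    shifts_elapsed = (at - rotation_start) // shift_length
    return rotation[shifts_elapsed % len(rotation)]
```

With a weekly rotation of two people, the first person covers the first week, the second covers the next, and the cycle then repeats.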

Read more »