Heartbeat Monitoring as a Service


This post was originally published May 1st, 2013
Updated on July 17th, 2018

 

King James I of England missed the mark when he allegedly proclaimed, “No news is better than evil news.” If you are not receiving any alerts, the silence may mean that the system sending them has malfunctioned. How are you supposed to know whether that is the case?

Schrödinger's cat

Like Schrödinger's cat, your silent system’s fate is unknown until you investigate. Nothing may be wrong, or the system may have failed to send alerts that should have warned you about current issues. A system in this state could be down for days, or even weeks, before someone becomes suspicious enough to check its status. OpsGenie’s Heartbeats add another level of assurance that your systems, and the tools monitoring them, are available and functioning as expected.

 

 

Who’s Watching the Watchers?

No one wants to be asked why a problem occurred and nothing was done about it, especially when the fault lies with a system that never sent the alerts rather than with a person who neglected to act on them.

If you support a web-based application or service, chances are you also use a monitoring tool to track its availability. OpsGenie uses external services to monitor the availability of our Web UI and API endpoints so that we know immediately if we have a problem. These monitoring tools are not infallible, though, which is why OpsGenie has supported Heartbeats from the beginning: users can send OpsGenie periodic heartbeat messages.

Heartbeats help accomplish several things:

  • Ensure continuous connectivity between your systems and OpsGenie so alerts are sent and received without disruption.
  • Confirm that your Heartbeat sender systems are working as expected.
  • Monitor the availability of the OpsGenie API, since Heartbeat requests are processed like any other API request.
  • Notify you when backups do not complete, periodic tasks do not finish in the expected time, or scheduled reports fail to be generated and delivered.
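
For the backup and periodic-task case, one common pattern is to send the heartbeat only after the job succeeds, so that a failed or hung job shows up as a missing heartbeat. The Python sketch below illustrates the idea. It assumes the ping endpoint `https://api.opsgenie.com/v2/heartbeats/<name>/ping` and the `GenieKey` authorization header from OpsGenie's public v2 API, so check both against the current documentation for your account. The helper names (`build_ping_request`, `send_ping`, `run_with_heartbeat`) are hypothetical and exist only for this example.

```python
import urllib.parse
import urllib.request
from typing import Callable

# Assumed base URL of the OpsGenie v2 Heartbeat API; confirm it for your account.
API_BASE = "https://api.opsgenie.com/v2/heartbeats"


def build_ping_request(heartbeat_name: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a ping request for the named Heartbeat."""
    # Percent-encode the name so spaces and slashes are safe inside the URL path.
    url = f"{API_BASE}/{urllib.parse.quote(heartbeat_name, safe='')}/ping"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"GenieKey {api_key}"},
        method="GET",
    )


def send_ping(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the ping over the network and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def run_with_heartbeat(job: Callable[[], None], ping: Callable[[], object]) -> bool:
    """Run the job and call ping only if the job finishes without raising.

    When the job fails, no heartbeat is sent, so the Heartbeat eventually
    expires and the expiry alert reports the problem.
    """
    try:
        job()
    except Exception:
        return False
    ping()
    return True
```

A nightly backup script might build the request once with `build_ping_request("nightly-backup", api_key)` and then call `run_with_heartbeat(run_backup, lambda: send_ping(request))`, where `run_backup` and the Heartbeat name are placeholders for your own job and configuration.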

How Heartbeats accomplish these goals is defined by the user. You define how often an event should happen, which tells OpsGenie how often to expect a Heartbeat message. These messages confirm that your systems are working and connected as expected. You can track the frequency of heartbeats and their messages, and know when they expire. If no Heartbeat is received within the specified interval, an alert is generated and routed through policies to the correct team via escalations and on-call schedules.
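
The expiry rule itself is simple arithmetic on timestamps. The following Python sketch shows one way to express it. The function names and the choice to treat a heartbeat that arrives exactly at the interval boundary as still on time are assumptions made for illustration, not a description of how OpsGenie evaluates expiry internally.

```python
from datetime import datetime, timedelta


def is_expired(last_ping: datetime, interval: timedelta, now: datetime) -> bool:
    """Return True when more than one full interval has passed since the last heartbeat."""
    return now - last_ping > interval


def time_until_expiry(last_ping: datetime, interval: timedelta, now: datetime) -> timedelta:
    """Return the time left before expiry, or zero if the interval has already elapsed."""
    remaining = (last_ping + interval) - now
    return max(remaining, timedelta(0))
```

With a ten-minute interval and a last heartbeat at 12:00, a check at 12:05 reports five minutes remaining, and a check at 12:11 reports the Heartbeat as expired.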

We recently redesigned our Heartbeats feature to make configuration easier and to make Heartbeats team-based. They now work as their own entity instead of as an integration. All Heartbeat creation and management is done via the Heartbeats tab on your team’s dashboard.

Sample Scenario

Add Heartbeat

From the Heartbeats page, add a new Heartbeat with a unique name and your desired expire interval. The fields of the Heartbeat expiration alert are populated with default values, but you can also set them yourself.


You will see a designated email address, derived from the Heartbeat's name, which you can use to send heartbeats to OpsGenie.

An alert is created soon after the Heartbeat expires, notifying the recipients.


When a Heartbeat is active again, that is, once you have sent another heartbeat, it is displayed as “Active” under the Heartbeats tab on your team’s dashboard. The new heartbeat also closes the expiry alert automatically soon after.
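
The expire-then-recover cycle in this scenario can be modeled as a small state machine. The Python sketch below is a toy model for reasoning about the behavior, not OpsGenie's implementation. The class name `HeartbeatTracker`, the "Expired" status label, and the event strings are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class HeartbeatTracker:
    """Toy model of one Heartbeat and its expiry alert."""

    name: str
    interval: timedelta
    last_ping: datetime
    status: str = "Active"
    alert_open: bool = False
    events: List[str] = field(default_factory=list)

    def check(self, now: datetime) -> None:
        """Mark the Heartbeat expired and open a single alert once the interval has passed."""
        if now - self.last_ping > self.interval and not self.alert_open:
            self.status = "Expired"
            self.alert_open = True
            self.events.append("alert created")

    def ping(self, now: datetime) -> None:
        """Record a new heartbeat and close any open expiry alert automatically."""
        self.last_ping = now
        self.status = "Active"
        if self.alert_open:
            self.alert_open = False
            self.events.append("alert closed")
```

Calling `check` repeatedly after expiry opens only one alert, and the next `ping` both returns the status to "Active" and closes that alert, which mirrors the sequence described in the scenario.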


Assure connectivity between OpsGenie and your tools, and even for OpsGenie itself, with Heartbeats. Using Heartbeats as a service allows users to “watch their watchers” and make sure silence is not working to their detriment. Don’t let the fate of your monitoring tools be unknown or left in limbo. Start a free trial of OpsGenie to stay aware and in control of all your alerts and tools, or subscribe to our YouTube channel for product features and how-to videos.
