A couple of weeks ago, we announced direct integration with HipChat. Since then, we’ve continued to extend OpsGenie’s callback capabilities.
In operations, most of the time no news is good news. If we’re not receiving alerts from monitoring systems about problems, we tend to assume that all is well with the world. But what if we’re not receiving alerts because some part of our monitoring solution has not been working for days or even weeks? If you’ve ever found out about a problem with the monitoring systems after being asked why there was no alert for a particular problem, you know what I’m talking about. If you’re supporting a web-based application or service, chances are you’re employing a monitoring service to monitor the availability of your application from the outside, preferably from multiple locations. At OpsGenie we take advantage of external services to monitor the availability of the OpsGenie web UI, as well as the API endpoints. External web monitoring enables us to find out quickly when there is a problem with OpsGenie. In addition, OpsGenie has supported what we call "heartbeat monitoring" since the beginning. Heartbeat monitoring enables OpsGenie users to send OpsGenie periodic heartbeat messages. Heartbeat monitoring serves multiple purposes:
OpsGenie is fundamentally an alert router for operations teams. It receives alerts from operations management systems via email or API, and notifies the right people using the defined rules. OpsGenie also supports "callbacks", and can forward alert activity to external systems via webhooks. Every time an alert is created, acknowledged, commented on, or closed, or when an action is executed by a user, OpsGenie makes a web request to the URL specified in the webhook configuration. The web request includes a subset of the alert data in the body of the request in JSON format. The data passed includes the alert message, as well as the alertId and alias fields, which can be used to retrieve the rest of the alert data via the OpsGenie Alert API. OpsGenie users can configure callbacks to be triggered for all alerts or can define matching rules to forward only a subset of alerts. Webhooks provide a very flexible way to export the alert data that is aggregated in OpsGenie, and are used in many different ways. Some example uses we’ve seen include:
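As a rough illustration of the receiving side of such a callback, the Python sketch below parses a webhook request body and pulls out the fields mentioned above. The payload layout (an `action` key and a nested `alert` object holding `alertId`, `alias`, and `message`) and the helper name `parse_callback` are assumptions made for this example, not the documented OpsGenie format; check the webhook documentation for the actual structure.

```python
import json


def parse_callback(body: str) -> dict:
    """Extract the identifying fields from a webhook request body.

    Assumes a hypothetical payload shaped like:
    {"action": "...", "alert": {"alertId": "...", "alias": "...", "message": "..."}}
    """
    payload = json.loads(body)
    alert = payload.get("alert", {})
    return {
        "action": payload.get("action"),
        "alert_id": alert.get("alertId"),
        "alias": alert.get("alias"),
        "message": alert.get("message"),
    }


# Example request body, invented for illustration only.
sample_body = json.dumps({
    "action": "Acknowledge",
    "alert": {
        "alertId": "abc-123",
        "alias": "disk-full-web01",
        "message": "Disk usage above 90%",
    },
})

summary = parse_callback(sample_body)
print(summary["action"], summary["alert_id"], summary["message"])
```

A receiver built along these lines could then pass the `alert_id` or `alias` value to the Alert API to fetch the rest of the alert data.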
Not all alerts are created equal, nor should they be treated as such! Some alerts are critical and urgent, and we want to receive notifications immediately using any and all notification methods; others can wait until the morning, or an email may be sufficient. We find that it is as important for an alert notification system NOT to wake you up unnecessarily as it is to ensure you wake up when it’s necessary. OpsGenie now puts the user in full control. Users can decide how to get notified for different alerts based on the alert data and the time of day.
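One way to picture rules like these is as a function from alert data and time of day to a list of notification methods. The Python sketch below is a simplified model only: the `priority` field, the 08:00 to 22:00 daytime window, and the method names are hypothetical choices for illustration, not OpsGenie's actual rule format.

```python
from datetime import time
from typing import List

DAY_START = time(8, 0)   # hypothetical start of waking hours
DAY_END = time(22, 0)    # hypothetical end of waking hours


def notification_methods(alert: dict, now: time) -> List[str]:
    """Pick notification methods from alert data and the time of day."""
    if alert.get("priority") == "critical":
        # Critical alerts use every available method, at any hour.
        return ["push", "sms", "voice", "email"]
    if DAY_START <= now < DAY_END:
        # Non-critical alerts during the day: notify, but less intrusively.
        return ["push", "email"]
    # Non-critical alerts at night can wait in the inbox until morning.
    return ["email"]


print(notification_methods({"priority": "critical"}, time(3, 0)))
print(notification_methods({"priority": "low"}, time(3, 0)))
```

In this model a critical alert at 3 a.m. still triggers every method, while a low-priority alert at the same hour only produces an email.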
Schedules and escalations are out of beta
After a two-month beta period, the on-call schedules, rotations, and escalations features are out of beta and available to all Pro and Enterprise level subscribers. Several usability improvements have been rolled out based on the feedback we received during the beta. Thanks for all the feedback!
It is safe to say that monitoring tools and services universally support sending email alerts. Not surprisingly, creating alerts in OpsGenie via email is the most common integration method used by OpsGenie users. Based on the feedback we’ve received from OpsGenie users, we’ve enhanced the email integration capabilities to make them both easier to use and more flexible.