Monitoring scripts and cron jobs using OpsGenie heartbeat messages

Dec 15, 2014 by Halit Okumus

Like most organizations, we have a number of jobs that run periodically in the background, as well as jobs (scripts) that are triggered under certain conditions to automate various tasks, from moving files around or backing up data to generating reports. With regular monitoring tools, it can be rather difficult to monitor these jobs and get notified when cron jobs or scheduled tasks silently fail or don't complete on time.

OpsGenie Heartbeat monitoring can be used to monitor periodic jobs as well as ad-hoc or irregularly executed scripts. By sending a heartbeat message at the beginning of the script and another at the end, you can generate an alert if the script fails or does not complete within the expected time frame. The OpsGenie heartbeat monitoring API supports defining, enabling, and disabling heartbeat configurations as well as sending heartbeat messages, so it can be used from any programming language that can make HTTPS requests. In addition, we provide a Go executable that uses this functionality for you. You don't even need to write code; all you need to do is download the executable and you'll be able to effectively monitor your batch jobs.
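The start-and-end heartbeat pattern can be sketched roughly as follows. This is a minimal illustration, not the OpsGenie API or the Go executable: `run_with_heartbeats` and the `send_heartbeat` callable are hypothetical names, and the actual HTTPS call is deliberately left to whatever function you pass in. It also assumes that the closing heartbeat is sent only when the job exits successfully, so a failed or hung job leaves it missing.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_with_heartbeats(
    command: Sequence[str],
    send_heartbeat: Callable[[str], None],
    name: str,
) -> int:
    """Run `command`, sending one heartbeat before it starts and one after it succeeds.

    `send_heartbeat` is a hypothetical callable supplied by the caller; in a
    real setup it would make the HTTPS request to the heartbeat service.
    """
    send_heartbeat(f"{name}: started")
    result = subprocess.run(command)
    if result.returncode == 0:
        # Assumption: only a successful run sends the closing heartbeat, so a
        # failed job is detected by the absence of this message.
        send_heartbeat(f"{name}: finished")
    return result.returncode


if __name__ == "__main__":
    # Stand-in sender that just prints; replace with a real HTTPS call.
    exit_code = run_with_heartbeats(
        [sys.executable, "-c", "print('job ran')"], print, "nightly-backup"
    )
    sys.exit(exit_code)
```

Passing the sender in as a parameter keeps the wrapper independent of any particular heartbeat endpoint and makes it easy to exercise with a fake sender.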

Alerting for application errors using webhooks

Oct 23, 2014 by Berkay Mollamustafaoglu

We've integrated OpsGenie with dozens of monitoring services using webhooks. A structured, secure web request is far better than email as an integration mechanism. Almost every monitoring SaaS provider offers webhooks as an alerting mechanism. But there is no reason alerting via webhooks should be restricted to monitoring/management services. It can and should be used by any SaaS application as a way to integrate with the other systems its customers use.
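To make "structured, secure web request" concrete, here is one possible shape for an application-error webhook, sketched under stated assumptions. The event fields, the `X-Signature-SHA256` header name, and the choice of an HMAC-SHA256 signature over a JSON body are all illustrative; they are not taken from OpsGenie or from any specific provider.

```python
import hashlib
import hmac
import json
from typing import Dict, Tuple

# Hypothetical header name; real services each define their own.
SIGNATURE_HEADER = "X-Signature-SHA256"


def build_webhook_request(event: dict, secret: bytes) -> Tuple[bytes, Dict[str, str]]:
    """Serialize `event` as JSON and sign the body with a shared secret."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    headers = {"Content-Type": "application/json", SIGNATURE_HEADER: signature}
    return body, headers


def verify_webhook_request(body: bytes, headers: Dict[str, str], secret: bytes) -> bool:
    """Recompute the signature on the receiving side and compare in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, headers.get(SIGNATURE_HEADER, ""))


if __name__ == "__main__":
    # Illustrative event; the field names are assumptions.
    event = {"source": "billing-app", "level": "error", "message": "payment failed"}
    body, headers = build_webhook_request(event, b"shared-secret")
    print(verify_webhook_request(body, headers, b"shared-secret"))
```

Unlike an email, the receiver gets named fields it can parse directly, and it can reject any request whose signature does not match the body.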

Monitor your monitoring tools using heartbeats

Sep 24, 2014 by Berkay Mollamustafaoglu

Monitoring tools are complex applications themselves, with multiple moving parts. Given that we rely on them to detect and alert us when there is a problem in our critical applications, it is essential to ensure that our tools are working as expected and can notify us when there is a problem. OpsGenie heartbeat monitoring offers a simple method to do just that:

How to use CloudWatch to generate alerts from logs?

Aug 12, 2014 by Veli Burak Celen

At the latest AWS NYC Summit, Amazon announced "CloudWatch Logs", a log storage and monitoring feature that enables AWS customers to monitor and troubleshoot systems and applications using system, application and custom log files. CloudWatch Logs currently lacks some essential log management capabilities, such as search and sophisticated visualizations; nonetheless, it is a major leap in functionality for CloudWatch.

CloudWatch Logs enables AWS customers to easily move logs off individual EC2 instances into a central repository and browse the logs via the web UI. But the most appealing feature of CloudWatch Logs is arguably the ability to monitor the logs for specific phrases, values or patterns, and generate alarms from them. CloudWatch Logs supports a variety of use cases:

SaaS integrations put traditional enterprise software to shame

May 31, 2014 by Berkay Mollamustafaoglu

I’ve spent many years implementing traditional enterprise IT operations management tools. Integrations among various tools are often the Achilles’ heel of the management systems. Integrating disparate applications is often a high-risk endeavor for customers. Enterprise vendors typically charge tens of thousands of dollars for integration “plugins”, and the implementation requires highly skilled (and expensive) engineers. To make matters worse, enterprise vendors are often not keen on collaborating with their competitors. Far from collaborating to help their customers, vendors sometimes actively block integration efforts. I’ve seen a vendor refuse to sell its product to another vendor to prevent them from integrating with it (how is that for putting the customer first?).

Designing alerts that help, not just annoy

May 28, 2014 by Berkay Mollamustafaoglu

"Empowering the alert recipients" has been a core principle of our product development since the beginning, driving many of the capabilities that differentiate OpsGenie from the alternatives. We believe that the role of the alerting system does not end with an alert notification that is devoid of any useful information. Sure, we need to make sure the right person is notified when there is a problem, but we cannot declare "mission accomplished" just because we’ve told someone that there is an alert. We believe that if we’re going to interrupt someone, at the very least asking for their attention and at worst waking them up, we ought to provide the relevant information that would enable them to assess the severity and urgency of the problem as well.

How do you manage alerts during code deploys?

Apr 21, 2014 by Berkay Mollamustafaoglu

As organizations embrace DevOps and Continuous Deployment, it’s becoming common to do frequent code deploys, often multiple times a day. Deployments inevitably cause monitoring tools to generate alerts as applications & servers become temporarily unavailable/unresponsive. This can be problematic since these alerts:

  • generate noise, and unnecessarily interrupt people
  • mislead people and cause them to waste time chasing down nonexistent problems
  • erode attention and cause people to miss real problems

Routing phone calls using on-call schedules

Apr 7, 2014 by Berkay Mollamustafaoglu

OpsGenie routes alerts to the right person using policies, on-call schedules, etc. defined by users. Over the last year, we’ve heard a similar question from a number of OpsGenie customers: “Can we route phone calls to the right person like we route the alerts?”
