Dec 26, 2014 by Halit Okumus
Like most organizations, we have a number of jobs that run periodically in the background, as well as jobs (scripts) that get triggered under certain conditions to automate various tasks, from moving files around and backing up data to generating reports.
With regular monitoring tools, it can be rather difficult to monitor these jobs and get notified when cron jobs or scheduled tasks silently fail or don't complete on time.
OpsGenie Heartbeat monitoring can be used to monitor periodic jobs as well as ad-hoc or irregularly executed scripts.
By sending one heartbeat message at the beginning of the script and another at the end, you can generate an alert if the script fails or does not complete within the expected time frame.
The OpsGenie heartbeat monitoring API supports defining, enabling, and disabling heartbeat configurations as well as sending heartbeat messages, so it can be used from any programming language that can make HTTPS requests.
In addition, we provide a Go-based executable that makes use of this functionality for you.
You don't even need to write code; all you need to do is download the executable and you'll be able to effectively monitor your batch jobs.
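For readers who do want to call the API from their own scripts, the start-and-end heartbeat pattern can be sketched roughly as follows. This is a minimal illustration, not the official client: the endpoint URL, the JSON field names (`apiKey`, `name`), and the helper names `send_heartbeat` and `run_with_heartbeats` are all assumptions made for the example, so check the OpsGenie API documentation for the real request format.

```python
import json
import urllib.request
from typing import Callable

# Hypothetical placeholder endpoint; the real URL comes from the OpsGenie API docs.
HEARTBEAT_URL = "https://api.example.invalid/heartbeat/send"


def send_heartbeat(name: str, api_key: str, url: str = HEARTBEAT_URL) -> None:
    """POST a JSON heartbeat message over HTTPS (payload shape is assumed)."""
    body = json.dumps({"apiKey": api_key, "name": name}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()


def run_with_heartbeats(job: Callable[[], None], beat: Callable[[], None]) -> None:
    """Send one heartbeat before the job and one after it succeeds.

    If the job raises or hangs, the second heartbeat is never sent, so the
    heartbeat can expire on the server side and trigger an alert.
    """
    beat()
    job()
    beat()
```

A nightly backup script could then call `run_with_heartbeats(do_backup, lambda: send_heartbeat("nightly-backup", api_key))`, where `do_backup` and the heartbeat name are placeholders for your own job.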
Oct 23, 2014 by Berkay Mollamustafaoglu
We've integrated OpsGenie with dozens of monitoring services using webhooks.
A structured, secure web request is far better than email as an integration mechanism.
Almost every monitoring SaaS provider offers webhooks as an alerting mechanism.
But there is no reason alerting using webhooks should be restricted to monitoring/management services.
It can and should be used by any SaaS application as a way to integrate with other systems used by its customers.
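To make "structured, secure web request" a little more concrete, here is a rough sketch of how an application might build a webhook and how a receiver might check it. Signing the body with HMAC-SHA256 is one common approach, not something the post prescribes; the `X-Signature` header name and the function names are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Dict, Tuple


def build_webhook(event: dict, secret: bytes) -> Tuple[bytes, Dict[str, str]]:
    """Serialize an event as JSON and sign the body with a shared secret."""
    body = json.dumps(event, sort_keys=True).encode("utf-8")
    signature = hmac.new(secret, body, hashlib.sha256).hexdigest()
    headers = {
        "Content-Type": "application/json",
        "X-Signature": signature,  # hypothetical header name
    }
    return body, headers


def verify_webhook(body: bytes, headers: Dict[str, str], secret: bytes) -> dict:
    """Recompute the signature and parse the body only if it matches."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, headers.get("X-Signature", "")):
        raise ValueError("signature mismatch")
    return json.loads(body)
```

The sender would POST `body` with `headers` over HTTPS; the receiver gets a parsed, structured event instead of having to scrape fields out of an email.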
Sep 24, 2014 by Berkay Mollamustafaoglu
Monitoring tools are complex applications themselves, with multiple moving parts. Given that we rely on
them to detect and alert us when there is a problem in our critical applications, it is essential to
ensure that our tools are working as expected and can notify us when there is a problem. OpsGenie
heartbeat monitoring offers a simple method to do just that:
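The underlying idea can be sketched in a few lines: each monitoring tool reports in periodically, and any tool that has been silent for longer than its expected interval is flagged. This is only an illustration of the concept, not OpsGenie's implementation; the function name and data layout are assumptions.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def expired_heartbeats(
    last_seen: Dict[str, datetime], interval: timedelta, now: datetime
) -> List[str]:
    """Return the names of sources whose last heartbeat is older than `interval`."""
    return sorted(name for name, seen in last_seen.items() if now - seen > interval)
```

Anything returned by `expired_heartbeats` is a tool that may itself be down, which is the case an alert should cover.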
Aug 12, 2014 by Veli Burak Celen
At the latest AWS NYC Summit, Amazon announced "CloudWatch Logs", a log storage and monitoring feature that enables AWS customers to monitor and troubleshoot systems and
applications using system, application and custom log files. CloudWatch Logs currently lacks some essential log management capabilities, such as search and sophisticated
visualizations; nonetheless, it is a major leap in functionality for CloudWatch.
CloudWatch Logs enables AWS customers to easily move logs off individual EC2 instances into a central repository and browse the logs via the web UI. But the most appealing
feature of CloudWatch Logs is arguably the ability to monitor the logs for specific phrases, values or patterns, and generate alarms from them. CloudWatch Logs supports a variety
of use cases:
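As a rough illustration of the pattern-monitoring idea described above, the sketch below counts log lines that match a pattern and signals an alarm once a threshold is reached. It uses Python regular expressions as a stand-in and does not reproduce CloudWatch's own filter pattern syntax or API; the function name and threshold semantics are assumptions.

```python
import re
from typing import Iterable


def should_alarm(lines: Iterable[str], pattern: str, threshold: int) -> bool:
    """Return True when at least `threshold` lines match `pattern`."""
    regex = re.compile(pattern)
    matches = sum(1 for line in lines if regex.search(line))
    return matches >= threshold
```

For example, `should_alarm(access_log_lines, r"\b500\b", 10)` would flag a batch of access log lines containing ten or more HTTP 500 responses, where `access_log_lines` is your own iterable of log lines.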
May 31, 2014 by Berkay Mollamustafaoglu
I’ve spent many years implementing traditional enterprise IT operations management tools. Integrations
among various tools are often the Achilles’ heel of the management systems. Integrating
disparate applications is often a high-risk endeavor for customers. Enterprise vendors typically charge
tens of thousands of dollars for integration “plugins”, and the implementation requires
highly skilled (and expensive) engineers. To make matters worse, enterprise vendors are often not
keen on collaborating with their competitors. Far from collaborating to help their customers, vendors
sometimes actively block integration efforts. I’ve seen a vendor refuse to sell its product to another
vendor to prevent them from integrating with it (how is that for putting the customer first?).
May 28, 2014 by Berkay Mollamustafaoglu
"Empowering the alert recipients" has been a core principle for our product development since the
beginning, driving many of the capabilities that differentiate OpsGenie from the alternatives. We
believe that the role of the alerting system does not end with an alert notification that is devoid of
any useful information. Sure, we need to make sure the right person is notified when there is a problem,
but we cannot declare "mission accomplished" just because we’ve told someone that there is an
alert. We believe that if we’re to interrupt someone, at the very least asking for their
attention and at worst waking them up, we ought to provide the relevant information that would enable them
to assess the severity and the urgency of the problem as well.
Apr 21, 2014 by Berkay Mollamustafaoglu
As organizations embrace DevOps and Continuous Deployment, it’s becoming common to do frequent code
deploys, often multiple times a day. Deployments inevitably cause monitoring tools to generate alerts as
applications & servers become temporarily unavailable/unresponsive. This can be problematic, since these alerts:
- generate noise, and unnecessarily interrupt people
- mislead people and cause them to waste time chasing down nonexistent problems
- erode attention and cause people to miss real problems
Apr 15, 2014 by Berkay Mollamustafaoglu
OpsGenie provides the ability to execute actions directly from OpsGenie apps. In the post “you woke me up. now what?”,
I described how this capability can be used to gather additional information and enable
alert recipients to assess problems efficiently.
Apr 7, 2014 by Berkay Mollamustafaoglu
OpsGenie routes alerts to the right person using policies, on-call schedules, etc. defined by users.
Over the last year, we’ve heard similar questions from a number of OpsGenie customers:
“Can we route phone calls to the right person like we route the alerts?”
Mar 12, 2014 by Berkay Mollamustafaoglu
OpsGenie has supported direct integration with the popular chat services HipChat and Campfire for quite some time via
our callbacks, where OpsGenie forwards alert activity to chat rooms.