Mobile development takes considerable effort to meet the expectations and needs of customers. OpsGenie Mobile Apps provide a user-friendly UI along with a good user experience design; however, mobile development needs much more! Besides providing a user-friendly UI, mobile apps should be fast, stable and memory-friendly. Therefore, we continuously monitor the OpsGenie mobile apps with the help of New Relic and Crashlytics so that we can keep improving them.
Mobile applications are the first-response tool for incident management most of the time; therefore, being fast, stable and usable is essential for a mobile app. We have redesigned our iOS application from scratch as a native iOS app that takes advantage of the new capabilities Apple has added recently. After being beta tested thoroughly for over 2 months (thanks!), the iOS app is now available in the Apple App Store!
Like most organizations, we have a number of jobs that run periodically in the background, as well as jobs (scripts) that get triggered under certain conditions to automate various tasks, from moving files around and backing up data to generating reports. With regular monitoring tools, it can be rather difficult to monitor these jobs and get notified when cron jobs or scheduled tasks silently fail or don't complete on time.
OpsGenie Heartbeat monitoring can be used to monitor periodic jobs as well as ad-hoc or irregularly executed scripts. By sending one heartbeat message at the beginning of the script and another at the end, you can generate an alert if the script fails or does not complete within the expected time frame. The OpsGenie heartbeat monitoring API supports defining, enabling, and disabling heartbeat configurations as well as sending heartbeat messages, so it can be used from any programming language that can make HTTPS requests. In addition, we provide a Go-based executable that uses this functionality for you. You don't even need to write code; all you need to do is download the executable and you'll be able to effectively monitor your batch jobs.
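The start-and-end pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the OpsGenie client: `run_with_heartbeat`, the `send_heartbeat` callback and the `-start`/`-end` naming are all hypothetical, and the callback stands in for whatever HTTPS request the real heartbeat API expects.

```python
from typing import Callable


def run_with_heartbeat(
    job: Callable[[], None],
    send_heartbeat: Callable[[str], None],
    name: str,
) -> bool:
    """Run `job` between two heartbeats; return True if it completed.

    `send_heartbeat` is a hypothetical callback. In practice it would make
    an HTTPS request to the heartbeat API; that request is not shown here.
    """
    # Signal that the job has started.
    send_heartbeat(f"{name}-start")
    try:
        job()
    except Exception:
        # No end heartbeat is sent on failure, so a monitor that expects
        # one within the time frame can raise an alert.
        return False
    # Signal that the job finished normally.
    send_heartbeat(f"{name}-end")
    return True


if __name__ == "__main__":
    # Usage example with a stand-in sender that only prints the message.
    run_with_heartbeat(lambda: None, print, "nightly-backup")
```

Passing the sender in as a callback keeps the sketch independent of any particular endpoint or credentials, and makes the failure path easy to exercise with a fake sender.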
We've integrated OpsGenie with dozens of monitoring services using webhooks. A structured, secure web request is far better than email as an integration mechanism. Almost every monitoring SaaS provider offers webhooks as an alerting mechanism. But there is no reason alerting using webhooks should be restricted to monitoring/management services. Webhooks can and should be used by any SaaS application as a way to integrate with the other systems its customers use.
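As a rough illustration of what a "structured, secure web request" can look like, the sketch below builds a JSON webhook body and signs it with HMAC-SHA256 so the receiver can check that it came from the expected sender. This is one common approach, not a description of how OpsGenie or any particular provider does it; the `X-Signature` header name, the function names and the event fields are all assumptions made for the example.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, Tuple


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of the request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def build_webhook(secret: bytes, event: Dict[str, Any]) -> Tuple[bytes, Dict[str, str]]:
    """Serialize an event as JSON and attach a signature header.

    The "X-Signature" header name is hypothetical; sender and receiver
    only need to agree on it.
    """
    body = json.dumps(event, sort_keys=True).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-Signature": sign_payload(secret, body),
    }
    return body, headers


def verify_webhook(secret: bytes, body: bytes, signature: str) -> bool:
    """Receiver side: recompute the digest and compare in constant time."""
    return hmac.compare_digest(sign_payload(secret, body), signature)


if __name__ == "__main__":
    # Usage example with made-up event fields and a made-up shared secret.
    body, headers = build_webhook(b"shared-secret", {"event": "invoice.overdue", "id": 42})
    print(verify_webhook(b"shared-secret", body, headers["X-Signature"]))
```

Unlike an email, the receiver gets named fields it can parse directly, and the signature lets it reject requests that were not produced with the shared secret.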
Monitoring tools are complex applications themselves, with multiple moving parts. Given that we rely on them to detect and alert us when there is a problem in our critical applications, it is essential to ensure that our tools are working as expected and can notify us when there is a problem. OpsGenie heartbeat monitoring offers a simple method to do just that:
At the latest AWS NYC Summit, Amazon announced "CloudWatch Logs", a log storage and monitoring feature that enables AWS customers to monitor and troubleshoot systems and applications using system, application and custom log files. CloudWatch Logs currently lacks some essential log management capabilities, such as search and sophisticated visualizations; nonetheless, it is a major leap in functionality for CloudWatch.
CloudWatch Logs enables AWS customers to easily move logs off of individual EC2 instances into a central repository, and browse the logs via the web UI. But the most appealing feature of CloudWatch Logs is arguably the ability to monitor the logs for specific phrases, values or patterns, and generate alarms from them. CloudWatch Logs supports a variety of use cases:
"Empowering the alert recipients" has been a core principle of our product development since the beginning, driving many of the capabilities that differentiate OpsGenie from the alternatives. We believe that the role of the alerting system does not end with an alert notification that is devoid of any useful information. Sure, we need to make sure the right person is notified when there is a problem, but we cannot declare "mission accomplished" just because we’ve told someone that there is an alert. We believe that if we’re going to interrupt someone, at the very least asking for their attention and at worst waking them up, we also ought to provide the relevant information that enables them to assess the severity and urgency of the problem.
As organizations embrace DevOps and Continuous Deployment, it’s becoming common to do frequent code deploys, often multiple times a day. Deployments inevitably cause monitoring tools to generate alerts as applications & servers become temporarily unavailable/unresponsive. This can be problematic since these alerts:
- generate noise, and unnecessarily interrupt people
- mislead people and cause them to waste time chasing down nonexistent problems
- erode attention and cause people to miss real problems
OpsGenie provides the ability to execute actions directly from OpsGenie apps. In the post “you woke me up. now what?”, I described how this capability can be used to gather additional information and enable alert recipients to assess problems efficiently.
OpsGenie routes alerts to the right person using policies, on-call schedules, etc. defined by users. Over the last year, we’ve heard similar questions from a number of OpsGenie customers: “Can we route phone calls to the right person like we route the alerts?”