Recently, OpsGenie hosted a joint webinar with Signal Sciences, where Berkay Mollamustafaoglu, OpsGenie Co-Founder and CEO, and Zane Lackey, Founder and CSO of Signal Sciences Corp, discussed how to secure your web applications using OpsGenie and Signal Sciences.
We at OpsGenie continuously work hard to add new capabilities to our product. We are proud to announce the BETA release of a new resource that will help alleviate daily stress: Mass Notifications!
A few months back we held a webinar in conjunction with Logentries where we demonstrated “2 Use Cases for Using Logs in Incident Management.” In the webinar we examined traditional log management and IT alerting tools integrations along with the future of incident management with enriched alerts. Now if you’re already an OpsGenie or Logentries user or using monitoring and collaboration systems, like New Relic or Slack, this blog post might be of some interest to you. You will learn about improving your alert notification processes and incident response times.
CA Flowdock “Team Inbox” Integration
- OpsGenie has released a new version of the CA Flowdock Integration. You can now forward OpsGenie alert activities to the CA Flowdock "Team Inbox" using the Flowdock (Team Inbox) Integration. You can use the Flowdock (Chat) Integration to forward alert activities to the Flowdock "Chat Window".
- Please refer to Flowdock (Team Inbox) Integration for more details.
Heartbeat API v2 Released
- OpsGenie released Heartbeat API v2, a new API implementation designed around RESTful principles.
- Heartbeat API v1 will be deprecated by the end of June, 2017.
- You can reach Heartbeat API v2 documentation from here.
- OpsGenie now has bi-directional integration with Mattermost!
- Mattermost is an open source, private cloud Slack-alternative platform.
- OpsGenie is an alert and notification management solution that is highly complementary to Mattermost.
- With this integration, OpsGenie's alert notifications go right into your Mattermost channel. In turn, you can execute actions such as acknowledging, closing or assigning alerts from the channel by just typing a /genie command.
- For more information, please visit the Mattermost Integration document.
SolarWinds Web Help Desk Integration
- SolarWinds Web Help Desk automates IT help desk and asset management operations. It also centralizes IT asset discovery and management.
- You can easily integrate SolarWinds Web Help Desk with your OpsGenie account and get notifications from OpsGenie when a ticket is created on SolarWinds Web Help Desk.
- For more information, please visit the SolarWinds Web Help Desk Integration document.
You spend time creating your OpsGenie configuration, so it’s now crucial to ensure that your account configuration is safe and reproducible just in case something goes wrong.
Git is a distributed version control system that records changes to a file or set of files over time. This allows tracking changes to each file and recalling a specific version later. It’s primarily used by developers as a source code management system; however, it can and should be utilized in Operations, too. Git works well not just for code but also for other kinds of files, such as configuration files.
Earlier, OpsGenie introduced the new “OpsGenie Configuration Management Tool.” This tool uses the OpsGenie Java SDK to export OpsGenie configurations as JSON files, enabling customers to take advantage of Git to manage the configuration data. Got that? The tool also supports restoring configurations, which means that it not only tracks configuration changes but can also revert to a previous configuration if needed.
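To illustrate why JSON exports pair well with Git, here is a minimal Python sketch. It does not use the OpsGenie Configuration Management Tool or the Java SDK; the `export_config` and `restore_config` helpers, the directory layout, and the sample data are all hypothetical. The idea is to write each configuration entity to its own file with sorted keys, so that re-exporting an unchanged configuration produces identical files and a Git diff shows only real changes.

```python
import json
import tempfile
from pathlib import Path


def export_config(config, out_dir):
    """Write each entity to <out_dir>/<kind>/<name>.json (hypothetical layout)."""
    written = []
    for kind, entities in sorted(config.items()):
        kind_dir = out_dir / kind
        kind_dir.mkdir(parents=True, exist_ok=True)
        for name, body in sorted(entities.items()):
            path = kind_dir / f"{name}.json"
            # Sorted keys and a fixed indent keep the output stable across exports.
            path.write_text(json.dumps(body, indent=2, sort_keys=True) + "\n")
            written.append(path)
    return written


def restore_config(out_dir):
    """Read the exported files back into the same nested dict shape."""
    config = {}
    for path in sorted(out_dir.glob("*/*.json")):
        config.setdefault(path.parent.name, {})[path.stem] = json.loads(path.read_text())
    return config


# Made-up sample data standing in for an exported account configuration.
SAMPLE = {
    "teams": {"ops": {"members": ["alice", "bob"]}},
    "schedules": {"ops_schedule": {"timezone": "UTC", "rotation": "weekly"}},
}

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        export_config(SAMPLE, Path(tmp))
        print(restore_config(Path(tmp)) == SAMPLE)
```

With the files in a Git repository, committing after each export gives a history of configuration changes, and checking out an earlier commit yields the files to restore from.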
New Customers Yahoo, HubSpot, Overstock Sign On and Fuel Company’s Momentum; AppDynamics Chairman Jyoti Bansal Joins as an Advisor, and Google Executive Izzy Azeri Joins the Board.
Falls Church, VA – January 25, 2017 – OpsGenie, a leading alerting and on-call management platform for engineering and operations teams, today announced record growth for 2016 and new strategic advisors for the company.
Slack is great!
It is not just a messaging app; it is so much more with its apps. OpsGenie’s integration with Slack is so thorough that it has become a top app in the Slack app directory!
OpsGenie's Slack app has built-in slash commands, which are an efficient way to execute actions in a channel. For example, you have the option to create an OpsGenie alert in a channel by typing a command like /genie alert [alert message] for [user team]. Slash commands are very powerful and cover the majority of use cases. However, there may be more you want to do in your Slack channel.
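As an illustration of the command shape quoted above, the following Python sketch splits a `/genie alert [alert message] for [user team]` line into its two parts. It is only a guess at how such text could be parsed, not how the OpsGenie Slack app does it; `parse_alert_command` and `AlertCommand` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class AlertCommand(NamedTuple):
    message: str
    recipient: str


# Matches "/genie alert <message> for <recipient>". Because the message group
# is greedy, the LAST " for " in the text separates message from recipient.
_PATTERN = re.compile(r"^/genie\s+alert\s+(?P<message>.+)\s+for\s+(?P<recipient>\S.*)$")


def parse_alert_command(text: str) -> Optional[AlertCommand]:
    """Return the parsed command, or None if the text has a different shape."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    return AlertCommand(match["message"].strip(), match["recipient"].strip())


if __name__ == "__main__":
    print(parse_alert_command("/genie alert CPU usage is high for ops_team"))
    print(parse_alert_command("/genie something else"))
```

Splitting on the last " for " is a design choice made here so that messages which themselves contain the word "for" stay intact.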
As opposed to the traditional software business, the modern-day software business is platform-based, which empowers integration with a widening world of products and services and thus enables new business ecosystems. The resulting ecosystems enable results and achievements that are far more effective and capable than individual product offerings. In this new software world, companies co-evolve their capabilities and cooperate to support consumer needs at every stage, eventually conceiving the next big innovations in technology.
The partnership comes as SolarWinds discontinues SolarWinds Alert Central and recommends OpsGenie to its customers as an effective alternative
AUSTIN, Texas – January 4, 2017 – SolarWinds, a leading provider of powerful and affordable IT management software, today announced a strategic partnership with OpsGenie, a leading alerting and on-call management solution for development and operations teams. The partnership comes as SolarWinds discontinues its free alert aggregation tool, SolarWinds® Alert Central.
Last week we were proud to announce a strategic partnership with SolarWinds! As SolarWinds announced the End-of-Life of their Alert Central (AC) product, they found OpsGenie to be the best alerting and On-Call (incident) management platform on the market to recommend to their customers. OpsGenie was enthusiastically ready to make this transition as smooth as possible.
Monitoring is an important part of DevOps. You need to monitor your servers, containers, and now your serverless functions. Function as a Service (FaaS), also called “Serverless” architecture, is a relatively new concept. AWS Lambda is by far the most popular FaaS/Serverless solution in the market. It is an event-driven, serverless computing service.
Custom User Role Updates
- Users who are not admins or owners can now be granted the right to update active or planned Maintenance Policies.
- You can refer to Users and Roles for further information about custom user roles.
ConnectWise Automate (LabTech) Integration
- ConnectWise Automate, formerly LabTech, is the industry-leading IT automation software designed to allow you to automate any IT task to improve your IT services. ConnectWise Automate allows you to discover all devices and users so they can be proactively monitored.
- You can easily integrate ConnectWise Automate with your OpsGenie account and get notifications from OpsGenie when an alert is created on ConnectWise Automate.
- For more information, please visit the ConnectWise Automate Integration document.
SolarWinds and OpsGenie Partnership to Meet Customers’ Alerting and On-Call Management Needs
1. Manage your alerts on the go!
Are email notifications enough to control your IT infrastructure's behavior? No, not anymore. Email notifications are no longer as effective when there are other channels of communication available, such as phone calls, SMS, or iOS & Android push notifications.
It is that time of the year again! New hopes and motivations to improve our lives for the better! Yes, we are talking about New Year’s resolutions. :) Aside from your personal resolutions, we would like to focus on the positive changes you can make to your professional life.
As incident arbitrators, we all know how complicated and stressful it is to work on time-critical incidents. Every extra minute your team spends on resolving an incident is costly and may have a devastating impact on your business and customers. So what can you do to minimize the time, effort, and stress related to major incidents?
Berkay here! Early this week, OpsGenie had the opportunity to host a very exciting Customer Appreciation event. During the week of the AWS re:Invent conference we opted to foster relationships outside the conference. The OpsGenie team flew into Las Vegas on Monday night, and boy are our arms tired. Ok. Bad joke. :)
Anyways, the next morning we made the most of our day by beginning it with a company breakfast at the MGM Grand Hotel, where we discussed business and logistics for the event. But first, we took a walk along the strip for some fresh air and adventure. It was a beautiful day in Vegas!
At OpsGenie, we do our best to enhance our customers' experience using our platform. We consider their needs and develop features that will promote efficiency and growth.
We believe that our product is already feature-rich, yet we’re continuing to develop it further to promote customization and flexibility. Our team works daily on product enhancements, and they’re always ready to extend it with customizations as you need.
Backup is key in information technology. Data loss may be irreversible, causing huge inconveniences to people and companies. When you have a large number of administrators who constantly make changes to the system, the risk of taking incorrect actions is high.
Have you ever lost data with an accidental click? Was your hardware corrupted or your PC infected by viruses and cyber attacks which led to data loss? Research indicates that almost 50% of people do not back up their data and then try in vain to recover it.
In today’s world, most organizations use a team-based structure. With this, organizations strive to define responsibilities, build the right skill sets, distribute workload, and eventually maximize productivity and success.
At OpsGenie, we care about our customers’ flexibility to adapt our software to their organizational needs.
At OpsGenie, we do our best to be punctual and early adopters of new features released by the companies we integrate with. We want to stay flexible while providing a complete workflow and feature set.
Not so long ago, Slack introduced “Interactive Buttons,” for which OpsGenie had a great, tested use case! So, we started working on it. As we were an early adopter, our previous Slack Application (App) was already part of the directory, which led us to build a brand new Slack App.
We, at OpsGenie, work to make integrating our product into the user’s existing toolset as smooth as possible. This may seem like an unmanageable task, since each customer has a unique model for conducting their daily operations.
ChatOps: Productive Chatting
A particular model of operation that’s on the rise is ChatOps. This model revolves around team communication and aims to unify all of the user’s tools, workflows, and processes into a single chatroom. An example might include a bot that pulls your code from GitHub and deploys it straight into one of your Amazon EC2 machines. All you need to do is type the command that triggers the bot to follow the action and voilà! Everyone on your team will also be able to follow the procedure live from the chatroom. Among other benefits, this ChatOps unified model makes operations more transparent and increases team productivity.
For us, incident responders and managers, incident management is a complicated beast that requires an active effort to streamline an effective workflow to identify, analyze, and solve incidents. Failing to notice a problem is intolerable for us. On the other hand, we don't want so many alerts and notifications that they cause alert fatigue and lead to longer response times or to missing significant incidents. To prevent these two crucial needs from becoming a dilemma, we are excited to introduce a set of new features: Auto Restart and Alert Count Based Notification Policies.
If you haven’t noticed yet, Nintendo's new game, Pokémon GO, is literally taking over the world. Pokémon GO reinvents a classic game with an augmented reality twist. The game is not available worldwide yet; however, that has not prevented users from downloading and catching 'em all. According to a Forbes article by Jason Evangelho, Pokémon GO is about to surpass Twitter in daily active users on Android and is #1 on Google Play Store. Incredible success!
What tools do you need to be an on-call warrior or hero? In this blog post we’ll examine the instruments needed to be on-call. Our goal is to provide you with on-call gear and tool ideas that help make your on-call experience successful, whether you’re an on-call novice or pro. We welcome any additional tips from our readers by responding to this post through our social media channels -- LinkedIn, Twitter, Facebook or Google+.
Falls Church, VA—June 29, 2016—OpsGenie, an emerging player in the critical area of IT alerting and on-call management, has raised $10 million in Series A financing from Battery Ventures, a global investment firm. OpsGenie will use the funds to continue to tackle the biggest challenges faced by customers in providing “always-on” services, and specifically to continue investing in its product and building out its go-to-market capabilities. As part of the financing, Battery General Partner Neeraj Agrawal and Battery Vice President Paul Drews will join OpsGenie’s board.
With OpsGenie you have the option of executing actions directly through our OpsGenie app. We have described how this capability can be used to gather additional information and enable alert recipients to assess problems efficiently in the post “You woke me up, now what?”
Directly through Slack you can (1) forward alert activity to Slack channels and (2) allow users to interact with alerts: acknowledge, comment, close, etc. Refer to our blog post titled “Bi-Directional Integration with Slack.”
In the coming weeks OpsGenie will help buyers looking for a reliable, scalable, and customizable alerting and incident management solution by assessing features, toolsets, and functionality in a comprehensive comparison between OpsGenie, PagerDuty, and VictorOps through a series of detailed blog posts. It is our goal to shed light on who does what, and on the stark differences between the three popular technologies. OpsGenie will concentrate on areas within our platform that we believe are extremely important when looking for an alerting and incident management solution for the dev&ops and IT community in general. This week we will focus on Email Integration.
PART 1: Email Integration -- A comparative assessment of the direct contrasts between OpsGenie, PagerDuty, and VictorOps. OpsGenie’s email integration enables customers to integrate OpsGenie with any system that can send alerts via email. Email integration is the most commonly used integration method by our customers since it is easy to use and almost any system out there can send emails.
For most of us in ops, it is vital to get notified ASAP about problems that impact the services we provide. It’s often a race against time to restore the service or to prevent an outage. But not all alerts require an immediate response; some can wait. Enabling users to deal efficiently with alerts that don’t require an immediate response is just as important in preventing alert fatigue and ensuring we can stay fresh.
At OpsGenie our mission is to empower our users to handle critical as well as non-critical or non-urgent incidents efficiently.
The OpsGenie team recently had a thorough and heated discussion (KAPOW!!) on who would be better with on-call alerts and incident management, Superman or Batman? Who would come out as winner when pitted against each other in a war of on-call alerts and response time? So, we thought we would hash it out here on our blog in a completely fictional format. We’ll try to examine each area of alerting and incident management to see who we think we would want on our on-call team.
As if you needed another reason to love OpsGenie and all its capabilities, we released an OpsGenie app for the hi-tech, sleek Apple Watch, so now you can get the most out of your weekends and look stylish doing it.
Continuing with the discussion on how OpsGenie can help alleviate alert fatigue, we will be examining areas where on-call employees take specific bulk actions to reduce the excessive alerts that often hinder operations.
The concept of “Alert Fatigue” is well known in industries such as healthcare, and awareness is increasing in IT operations as well. Fighting alert fatigue has been a key design objective for OpsGenie since our inception. Summarized in the earlier post, some of the key capabilities that OpsGenie provides can be used to alleviate alert fatigue. In a two part series, I go into more detail on how these features can improve the alert signal to noise ratio.
Since we launched the OpsGenie phone call routing feature last year, we’ve had an enormously positive response from customers. So much, in fact, that we’re dusting off this blog post from last year and updating it for everyone who is not as familiar with it. Is it easy to use? Yes, it is! You see, OpsGenie routes alerts to the appropriate on-call individual using policies, on-call schedules, etc. Prior to the launch of the application last year, we heard similar questions from a number of our OpsGenie customers, such as “Can we route phone calls to the right person like we route the alerts?” This turned out to be a great question, one that resonated with many of our customers. For a product team, customer feedback like this is priceless!
As an alert notification solution, our first priority is to ensure that the right person is notified when there is a problem. OpsGenie sends multiple notifications through different channels, escalates etc. to ensure that critical alerts don’t get missed. As crucial as that is, if an alert notification system just stops at “waking you up”, it becomes part of the problem rather than a solution.
Every service provider wants their services to be available 24x7x365. But outages and planned maintenance are inevitable occurrences for online software services. Dealing with outages and communicating with users during the outage is as important as the availability of the services provided. To keep users informed, many service providers use web based “status pages” that contain up to date information about the health of the services, incidents, and what the provider is doing to resolve the issues.
OpsGenie is an incident management system for Dev & Ops teams. Customers use OpsGenie to consolidate their alerts generated by monitoring systems and route them to the right people using on-call schedules and escalations. Because OpsGenie is an essential tool used during outages and we have vital information about the incidents, our customers have been inquiring if we can create “status pages” programmatically based on the alerts generated in OpsGenie.
Responding to this request, we’ve taken up the challenge to provide this solution to manage status pages for OpsGenie customers.
As long as our applications are in production, boosting uptime and avoiding outages is the highest priority for us developers and operational teams. Despite great care, having 100% uptime and avoiding outages is a challenging task for even the most stringent DevOps teams. Let’s imagine that one of your data centers stops responding and in turn your email service is completely out, or your payment service has gone offline during Black Friday. Remember the AWS outage that lasted four days and affected countless cloud services in April 2011? It is a good example that outages happen even in the most secure environments. Now what? Are you going to examine huge log files to find out what went wrong? Are you going to notify all of your operational teams and developers at the same time to investigate the cause? Unless you allocate large resources for chaos engineering like Netflix does, you most likely will have very limited time to overcome the issue. So those aren’t realistic options for most organizations.
I’ve spent many years implementing traditional enterprise IT operations management tools. Integrations among various tools are often the Achilles’ heel of management systems. Integrating various applications is often a high-risk endeavor for customers. Enterprise vendors typically charge tens of thousands of dollars for integration “plugins”, and the implementation requires highly skilled (and expensive) engineers. To make matters worse, enterprise vendors are often not keen on collaborating with their competitors, let alone collaborating to help their customers. Vendors sometimes even block these integration efforts. I’ve witnessed a vendor refusing to sell their product to prevent others from integrating with it (how is that for putting the customer first?).
OpsGenie Webhook integration provides great flexibility to build solutions for specific requirements. In this blog post, we'll build a real-time dashboard for OpsGenie alerts. This dashboard will provide a quick overview of the most recent open alerts in OpsGenie, and when there are new activities on alerts, the dashboard will reflect these changes immediately.
The solution leverages AWS API Gateway and Lambda services as a serverless backend and PubNub for the real-time data stream. Both AWS and PubNub offer free tiers.
The OpsGenie integration family has many members and is still growing. The objective of this blog post is to explain how to use one of the newly added ones, Logstash. Logstash is a data pipeline that helps you process logs and other event data from a variety of systems.
A Logstash pipeline in most use cases has one or more input, filter, and output plugins. Logstash has a rich collection of input, filter, codec and output plugins. A filter plugin performs intermediary processing on an event. Filters are often applied conditionally depending on the characteristics of the event. An output plugin sends event data to a particular destination. Outputs are the final stage in the event pipeline.
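The stages described above can be modeled in a few lines of Python. This is a conceptual toy, not Logstash configuration syntax or its plugin API; `run_pipeline`, `tag_errors`, and the sample events are hypothetical names and data chosen for illustration. It shows events entering the pipeline, a filter that only acts on events matching a condition, and outputs as the final stage.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, str]


def run_pipeline(
    events: Iterable[Event],
    filters: List[Callable[[Event], Event]],
    outputs: List[Callable[[Event], None]],
) -> None:
    for event in events:  # input stage: events enter the pipeline
        for apply_filter in filters:  # filter stage: intermediary processing
            event = apply_filter(event)
        for send in outputs:  # output stage: final destination(s)
            send(event)


def tag_errors(event: Event) -> Event:
    # Conditional filter: only events whose level is ERROR are modified.
    if event.get("level") == "ERROR":
        return {**event, "tag": "alert"}
    return event


# A plain list stands in for an output destination.
collected: List[Event] = []

run_pipeline(
    events=[
        {"level": "INFO", "msg": "started"},
        {"level": "ERROR", "msg": "disk full"},
    ],
    filters=[tag_errors],
    outputs=[collected.append],
)
```

After the run, `collected` holds both events, and only the ERROR event carries the extra `tag` field.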
In the previous blog post, we went through how to create JIRA issues from OpsGenie alerts using the open source Marid utility. The Marid approach is particularly useful when integrating OpsGenie with an on-premise, self-hosted JIRA instance, since Marid initiates the connection and does not require opening up the network.
Whenever possible, we implement direct integration between OpsGenie and IT management services used by our customers. For example, we have direct inbound integration with JIRA that enables OpsGenie customers to create alerts and notify users for JIRA issues.
Mobile development requires considerable effort to meet the expectations and needs of customers. OpsGenie Mobile Apps provide a user-friendly UI in parallel with a good user experience design; however, mobile development needs much more! Mobile apps should be fast, stable and memory-friendly besides providing a user-friendly UI. Therefore, we are monitoring OpsGenie Mobile apps continuously with the help of New Relic and Crashlytics to be able to improve our apps continuously (and of course applicatively).
Mobile applications are the first-response tool for incident management most of the time; therefore, being fast, stable and usable is essential for a mobile app. We have redesigned our iOS application from scratch as a native iOS app that takes advantage of all the new capabilities added by Apple recently. After being beta tested thoroughly for over 2 months (thanks!), the iOS app is now available in the App Store!
Like most organizations, we have a number of jobs that run periodically in the background, as well as jobs (scripts) that get triggered under certain conditions to automate various tasks, from moving files around and backing up data to generating reports. It can be rather difficult to monitor these jobs and get notified when cron jobs or scheduled tasks silently fail or don't complete on time using regular monitoring tools.
OpsGenie Heartbeat monitoring can be used to monitor periodic jobs as well as ad-hoc or irregularly executed scripts. By sending a heartbeat message at the beginning and one at the end of the script, you can generate an alert if the script fails or does not complete within the expected time frame. The OpsGenie heartbeat monitoring API supports defining, enabling, and disabling heartbeat configurations as well as sending heartbeat messages, so it can be used from any programming language that can make HTTPS requests. In addition, we provide a Go-based executable that makes use of these functionalities for you. You don't even need to write code; all you need to do is download the executable and you'll be able to effectively monitor your batch jobs.
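The start-and-end heartbeat pattern can be sketched in Python as follows. This is a local toy model under stated assumptions, not the OpsGenie Heartbeat API or the provided executable: `HeartbeatMonitor` and `run_job` are hypothetical names, and in practice the `ping` callable would make an HTTPS request to the heartbeat service instead of updating an in-memory object.

```python
import time
from typing import Callable, Optional


class HeartbeatMonitor:
    """Toy stand-in for the monitoring side: remembers the last heartbeat time."""

    def __init__(self, interval_seconds: float) -> None:
        self.interval_seconds = interval_seconds
        self.last_seen: Optional[float] = None

    def ping(self, now: Optional[float] = None) -> None:
        # Record when the most recent heartbeat arrived.
        self.last_seen = time.monotonic() if now is None else now

    def is_expired(self, now: Optional[float] = None) -> bool:
        # Expired means no heartbeat arrived within the allowed interval,
        # which is the condition that would raise an alert.
        if self.last_seen is None:
            return True
        current = time.monotonic() if now is None else now
        return current - self.last_seen > self.interval_seconds


def run_job(job: Callable[[], None], ping: Callable[[], None]) -> None:
    ping()  # heartbeat at the beginning of the script
    job()   # an exception here skips the final heartbeat
    ping()  # heartbeat at the end of the script


if __name__ == "__main__":
    monitor = HeartbeatMonitor(interval_seconds=60)
    run_job(lambda: print("backing up data"), monitor.ping)
    print("expired:", monitor.is_expired())
```

If the job crashes or hangs, the closing heartbeat never arrives, the interval elapses, and the monitor reports the heartbeat as expired.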
We've integrated OpsGenie with dozens of monitoring services using webhooks. A structured, secure web request is far better than email as an integration mechanism. Almost every monitoring SaaS provider offers webhooks as an alerting mechanism. But there is no reason alerting using webhooks should be restricted to monitoring/management services. It can and should be used by any SaaS application as a way to integrate with other systems used by its customers.
Monitoring tools are complex applications themselves, with multiple moving parts. Given that we rely on them to detect and alert us when there is a problem in our critical applications, it is essential to ensure that our tools are working as expected and can notify us when there is a problem. OpsGenie heartbeat monitoring offers a simple method to do just that:
"Empowering the alert recipients" has been a core principle for our product development since the beginning, driving many of the capabilities that differentiate OpsGenie from the alternatives. We believe that the role of the alerting system does not end with an alert notification that is devoid of any useful information. Sure, we need to make sure the right person is notified when there is a problem, but we cannot declare "mission accomplished" just because we’ve told someone that there is an alert. We believe that if we’re to interrupt someone, at the very least asking for their attention and at worst waking them up, we ought to provide the relevant information that would enable them to assess the severity and the urgency of the problem as well.
As organizations embrace DevOps and Continuous Deployment, it’s becoming common to do frequent code deploys, often multiple times a day. Deployments inevitably cause monitoring tools to generate alerts as applications & servers become temporarily unavailable/unresponsive. This can be problematic since these alerts:
- generate noise, and unnecessarily interrupt people
- mislead people and cause them to waste time chasing down nonexistent problems
- erode attention and cause people to miss real problems
OpsGenie provides the ability to execute actions directly from OpsGenie apps. In the post “you woke me up. now what?”, I’ve described how this capability can be used to gather additional information and enable alert recipients to assess problems efficiently.
OpsGenie routes alerts to the right person using policies, on-call schedules, etc. defined by users. Over the last year, we’ve heard similar questions from a number of OpsGenie customers: “Can we route phone calls to the right person like we route the alerts?”