How many servers can one system administrator manage? The question is hard to answer, since it depends heavily on the tasks involved. It is clear, however, that the number of servers one engineer can manage has increased tremendously over time and is still growing. Public and private clouds, in combination with automation tools, enable us to automate many daily tasks. In a modern IT infrastructure, almost everything can, and should, be automated, from the creation of a new instance to software deployment. In this scenario, automated monitoring is an essential component.
At OpsGenie, we pride ourselves on being the most reliable and flexible alert and incident management solution. However, what happens when you simply don’t want notifications? Even with escalations, routing rules, and on-call schedules, you may want finer control over when you are notified and for which types of alerts.
Do you receive support requests or internal queries by phone? OpsGenie’s Incoming Call Routing lets you manage your phone numbers, and how calls to them are routed, from a single place. Use your on-call schedules and escalations to determine the right team member for each incoming call, and make sure the call is not missed, just like your alerts.
Before the DevOps philosophy emerged, developers would build products, services, and infrastructure, but the responsibility for maintaining them would shift to operators, aka system or IT admins. The DevOps philosophy removes the boundary between Operations and Development teams, making system reliability a shared responsibility of all parties.
Incident response procedures for IT incidents are similar to the processes required for emergencies in the medical field. In previous posts we’ve compared on-call responders to on-call doctors: both are called during emergencies and expected to contain and remedy the problem, preventing loss and reducing impact. Assigning a priority to each alert is a great form of alert enrichment that helps accomplish this. Download our white paper to learn more best practices for maximizing resolution speed.
At OpsGenie, we integrate with the tools your team already uses in order to provide the best alert management and incident response platform possible. Our integrations team is always hard at work building integrations, not only for new tools but also for tools specifically requested by customers. If an integration does not exist for a monitoring, ticketing, chatops, reporting, or any other kind of tool your team uses, our integrations team can evaluate whether one can be built. Contact us for more information or requests!
Adopting a true DevOps culture is chock-full of challenges, such as shifting away from legacy infrastructure to a more microservices-centric approach, integrating tools, managing priorities, environment provisioning, traceability, and more. However, an approach in which Dev teams and Ops teams work together, using the same assets, can drastically change how an organization moves forward with each new release. For DevOps teams, proactively monitoring “left” of production can be a revolutionary element in achieving these goals.
This is an excerpt from our newest White Paper, Modern Incident Management for IT Operations, available for download now.