Recipients and owners of IT services demand high reliability and availability of IT systems from their providers.
Availability is the time a service operates without failure in relation to the total time during which it should work. Because business and IT environments change dynamically, ensuring availability at the level of 90 or 99% is no longer enough. Regardless of the industry a given organization operates in, it is not hard to imagine the consequences of critical systems running at 90% availability: for a service that should work around the clock, this equals almost 17 hours of downtime every week (10% of 168 hours)!
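The arithmetic behind the 90% example can be checked with a short sketch. It assumes the service is expected to run around the clock (168 hours per week); the function names are illustrative and do not come from any particular monitoring product.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours, assuming a 24/7 service window


def availability(uptime_hours: float, agreed_hours: float) -> float:
    """Fraction of the agreed service time during which the service worked."""
    if agreed_hours <= 0:
        raise ValueError("agreed_hours must be positive")
    return uptime_hours / agreed_hours


def weekly_downtime_hours(availability_pct: float,
                          agreed_hours: float = HOURS_PER_WEEK) -> float:
    """Hours of downtime per week implied by an availability percentage."""
    return agreed_hours * (1 - availability_pct / 100)


# Compare a few availability levels side by side.
for level in (90.0, 99.0, 99.9):
    print(f"{level}% availability -> "
          f"{weekly_downtime_hours(level):.2f} h of downtime per week")
```

At 90% the result is 16.8 hours per week, while 99% still leaves about 1.7 hours of downtime every week.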
The chart below presents the causes of IT system unavailability. Whether it results from human error or a device breakdown, in the majority of cases we are able to prevent downtime or reduce it to a minimum. Efficient IT management in an organization requires tools that monitor the availability and capacity of services and their components: servers, systems and applications. IT processes and procedures in the company also affect trouble-free use of the IT infrastructure. They all share the same objective: optimizing the IT infrastructure and the organization's capabilities so that they ensure a cost-efficient and stable level of availability whenever it is needed. For a business Client, this means achieving the assumed goals without disruption.
According to ITIL, availability management ensures that services are available whenever a Client needs them. Availability results from business requirements, the necessary infrastructure, and IT processes and procedures. The objective of this process is to optimize the ability of the IT infrastructure and the support organization to provide a cost-efficient and stable availability level that enables the business Client to achieve its goals.
Availability management includes, among other things:
- Optimizing availability by monitoring and reporting on all its key elements
- Anticipating and designing the expected availability and security levels
- Collecting, analyzing and maintaining availability data, and reporting on it
- Ensuring that service levels are verified against the availability levels agreed in the SLA
- Monitoring OLA objectives and the ability of external providers to deliver services
- Systematically increasing the availability of services and their components
Thanks to a process approach to availability management, we can reduce the frequency and duration of IT infrastructure breakdowns. Insufficient service levels are identified, which makes corrective work easier. The staff changes its attitude from reactive to proactive.
Over the past years, IT infrastructure monitoring tools have evolved along with the development of IT systems. In the beginning, leading providers offered tools for managing, monitoring and analyzing network activity. The solutions later introduced to the market facilitated the monitoring and management of operating systems. At that stage it became possible to detect problems early on the basis of alerts and events coming from managed systems. Administrators were additionally supported by performance data and reports that helped them plan the capacity of the IT infrastructure.
The next step was monitoring the system platform and the applications running on it, also from the end user's point of view, in both active and passive modes. Later systems could identify individual Clients, applications and transactions. The most recent development is the service view, which combines components of the network, server and application layers. Today IT systems are treated as an integral whole. Administrators can manage the infrastructure from both an operational and a business perspective, because IT is business and business is IT.
Server infrastructure monitoring is fundamental for a complex system. Solutions dedicated to this purpose make it possible to proactively monitor servers and operating, database and application systems in distributed environments, and they also facilitate managing them. Support covers the majority of systems currently in use and is built on an agent–server–client architecture. Data is collected by a set of agents that integrate with the monitored environment and provide a wide range of parameters describing how that environment works. Apart from agent-based solutions, the system also supports so-called agentless monitoring, based on a set of popular network protocols. Dynamic thresholds can be defined on the basis of historical data. As a result, users can efficiently analyze trends, plan asset usage better and identify the sources of potential problems.
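One common way to derive a dynamic threshold from historical data is to take the historical mean plus a multiple of the standard deviation. The sketch below illustrates that idea only; the formula, the multiplier `k` and the sample CPU values are assumptions, not a description of how any specific product calculates its thresholds.

```python
from statistics import mean, stdev
from typing import Sequence


def dynamic_threshold(history: Sequence[float], k: float = 3.0) -> float:
    """Upper threshold: historical mean plus k sample standard deviations."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    return mean(history) + k * stdev(history)


def is_anomalous(value: float, history: Sequence[float], k: float = 3.0) -> bool:
    """True when a new measurement exceeds the threshold derived from history."""
    return value > dynamic_threshold(history, k)


# Hypothetical CPU utilization samples (percent) collected by an agent.
cpu_history = [41.0, 39.5, 43.2, 40.1, 42.7, 38.9, 41.6]

print(f"threshold: {dynamic_threshold(cpu_history):.1f}%")
print("95% is anomalous:", is_anomalous(95.0, cpu_history))
print("42% is anomalous:", is_anomalous(42.0, cpu_history))
```

Unlike a fixed limit, a threshold of this kind follows the normal behavior of each monitored system, so a value that is routine for one server can still raise an alert on another.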
The Real User Experience solution tracks real transactions. It relies on specialized probes that receive a copy of the network traffic redirected from network switches. The system makes it possible to analyze network traffic and to identify individual Clients, applications, transactions and their components. The tool measures and collects real data on the response times of Client applications, the network and servers. Apart from analyzing correct operation and system capacity, the collected data can also be used for SLA reporting.
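As an illustration of how measured response times can feed SLA reporting, the sketch below computes the share of transactions answered within a target time, overall and per application. The `Transaction` record, the 500 ms target and the sample values are hypothetical; real probes would provide far richer data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Transaction:
    client: str
    application: str
    response_ms: float


def sla_compliance(transactions: List[Transaction], target_ms: float) -> float:
    """Percentage of transactions answered within target_ms."""
    if not transactions:
        raise ValueError("no transactions to evaluate")
    within = sum(1 for t in transactions if t.response_ms <= target_ms)
    return 100.0 * within / len(transactions)


def compliance_by_application(transactions: List[Transaction],
                              target_ms: float) -> Dict[str, float]:
    """Group transactions by application and compute compliance for each."""
    groups: Dict[str, List[Transaction]] = {}
    for t in transactions:
        groups.setdefault(t.application, []).append(t)
    return {app: sla_compliance(items, target_ms) for app, items in groups.items()}


# Hypothetical measurements captured from mirrored network traffic.
samples = [
    Transaction("client-a", "crm", 120.0),
    Transaction("client-b", "crm", 340.0),
    Transaction("client-a", "crm", 900.0),
    Transaction("client-c", "billing", 200.0),
]

print(f"overall: {sla_compliance(samples, 500.0):.1f}%")
print(compliance_by_application(samples, 500.0))
```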
From the end user's point of view, monitoring is especially important where high availability of systems is concerned. Solutions provided by Cloudware Polska identify and resolve problems before they interfere with the operation of business systems. They also make it possible to test services delivered over networks. The benefit of active monitoring is that application availability can be analyzed even when there is little or no traffic, because the actions of an end user are simulated from various locations. Test scenarios are recorded in a dedicated capacity testing tool, then uploaded directly to a server and distributed to monitoring agents. The same script can be distributed to many agents in various locations, which gives a wider view of the monitored environment.
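The idea of a recorded test scenario can be sketched as an ordered list of named steps that a monitoring agent executes and times. This is a simplified illustration: the step names are hypothetical, and the steps are plain Python callables standing in for real user actions such as page requests.

```python
import time
from typing import Callable, List, NamedTuple, Tuple


class StepResult(NamedTuple):
    name: str
    ok: bool
    seconds: float


def run_scenario(steps: List[Tuple[str, Callable[[], None]]]) -> List[StepResult]:
    """Run steps in order, timing each one; stop at the first failure."""
    results = []
    for name, action in steps:
        start = time.perf_counter()
        try:
            action()
            ok = True
        except Exception:
            ok = False
        results.append(StepResult(name, ok, time.perf_counter() - start))
        if not ok:
            break  # later steps depend on the failed one, so skip them
    return results


def broken_step() -> None:
    """Stand-in for a user action that fails, e.g. a login that times out."""
    raise RuntimeError("simulated outage")


# Hypothetical scenario: the second step fails, so the third never runs.
scenario = [
    ("open home page", lambda: None),
    ("log in", broken_step),
    ("view invoice", lambda: None),
]

results = run_scenario(scenario)
for r in results:
    print(f"{r.name}: {'OK' if r.ok else 'FAILED'} ({r.seconds:.4f} s)")
print("service available:", all(r.ok for r in results))
```

Because the scenario is plain data, the same list could be handed to agents in several locations and the results compared per location.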
Business Service Management (BSM) class solutions help to understand the complexity of the relations between business services and the technologies supporting them. The system provides advanced real-time visualization of service status and of the processes that handle related incidents. The analysis of the current situation includes incidents and data coming from a wide range of IT resources and business support systems. It is also possible to use, for example, data from monitoring applications, from systems for maintaining the telecommunications infrastructure, and from repositories storing business data such as the current transaction volume, revenue, Client data and signed contracts. The tools in the Cloudware Polska portfolio support problem solving through automatic analysis of breakdown causes and their impact on business services, as well as through integration with service support and service delivery systems. This gives a wider view of the services provided across operational and business silos and visualizes key efficiency indicators (KPI, SLA). It also helps automate the definition of the service model through integration with CMDB data. The BSM class solution that we offer reduces risk, since it not only measures the quality level but also visualizes it clearly.
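The impact of a breakdown on business services can be illustrated with a small dependency model, similar in spirit to what a CMDB provides. The sketch below walks the model upward from a failed component to find everything that depends on it. The model contents and function name are hypothetical, and the breadth-first traversal is only one possible way to implement such an analysis.

```python
from collections import deque
from typing import Dict, List, Set


def impacted_items(depends_on: Dict[str, List[str]], failed: str) -> Set[str]:
    """Return every item that directly or indirectly depends on `failed`."""
    # Invert the model: component -> items that depend on it.
    dependents: Dict[str, Set[str]] = {}
    for item, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(item)

    # Breadth-first walk from the failed component up to business services.
    seen: Set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical service model: each item lists what it depends on.
service_model = {
    "online-shop": ["web-app", "payments-api"],
    "web-app": ["db-server"],
    "payments-api": ["db-server"],
    "reporting": ["warehouse-db"],
}

print(sorted(impacted_items(service_model, "db-server")))
```

In this model a failure of `db-server` affects `web-app`, `payments-api` and the `online-shop` service, while `reporting` is untouched.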
Business Services Monitoring
Thanks to a process approach to availability management, we can reduce the frequency and duration of IT infrastructure breakdowns. Users efficiently analyze trends, plan asset usage better and identify the sources of potential issues.
- Datacenter – Monitoring of server infrastructure. Proactive monitoring of servers and operating, database and application systems in distributed environments.
- Real User Experience – A solution for tracking real transactions. The system makes it possible to analyze network traffic and to identify individual Clients, applications, transactions and their components.
- Synthetic Monitoring – From the end user's point of view, monitoring is especially important where high availability of systems is concerned. Our solutions identify and resolve problems before a breakdown occurs. They also simulate users' actions in tests.
- Services – Business Service Management class. Automatic analysis of breakdown causes and their impact on business services.