With the COVID-19 pandemic, the term «resilience» has taken on new and much greater importance for companies. Maintaining business operations in a crisis has gone from a hypothetical scenario to a real one.
The COVID-19 crisis has clearly shown that hardly any company or public authority can afford to suspend services simply because employees are working from home. The infrastructure that enables working from home has become a critical backbone of business: a functioning remote connection is key to ensuring the availability of business applications.
While management focuses on digitalizing the business, as this is where cost savings are likely to be made, IT resilience often falls by the wayside. This is risky, because increasing digitalization is turning the availability and stability of services into a decisive success factor.
Developing the business further and promoting innovation are important prerequisites for success in the market. At the same time, however, the existing infrastructure must not be neglected, so that customers can continue to use the services without interruption.
Digitalization makes companies of all sizes and in all industries more dependent on IT, because today's possibilities, such as the integration of cloud services, increase the complexity of IT systems. With that complexity, the risk of a disruption or failure rises sharply. This is at odds with the fact that most companies want to reduce IT costs. As a result, not only are efficiency improvements required on an ongoing basis, but time and money are also lacking for necessary investments in information security and in the privacy, availability and integrity of applications, although hardly any company can afford a failure of one or more IT systems.
Studies show that if the IT systems of a company with fewer than 500 employees are down for one hour, the damage amounts to at least CHF 20,000. For a company with more than 500 employees, the damage is considerably higher.
Disruptions and failures are not only annoying; they can develop into an existential threat. A well-known example is ransomware (crypto trojans): an attacker gains access to data, for instance through a fake e-mail, encrypts it, and extorts money from the company for its release.
IT system disruptions and failures receive little or no public coverage. As customers, we usually learn about them only when an error message appears, such as «Due to a technical problem, the service is currently unavailable. Our specialists are working flat out to find a solution.»
With increasing digitalization, companies are forced to make IT resilience a management issue. The topic is not extremely complex, but the necessary competence has to be built up over time.
IT resilience generally cannot be achieved with a single technical measure such as backup computers, cloud backups or fail-safe storage. It is more about gaining an overview of the business processes, their priorities, and the potential impact of disruptions on customers, production and administration (see BCM).
When considering individual services in terms of business continuity management (BCM) and IT resilience, it is important not to limit oneself to the technical perspective. Where each franc should be invested also depends on the users of the services, who generate the business. In other words: every franc invested in IT resilience should create the greatest possible business value.
Experience shows that involving the product managers leads to valuable discussions and to a common understanding between business and IT as to which applications and services create value for the company. Applications that have been in operation for some time usually receive less development effort. They tend to become legacy systems and thus a risk from an IT resilience perspective. Early planning, re-engineering or even replacement of applications helps to invest the money in the best possible way.
As an entry point into IT resilience according to TechDebt (technical debt), establishing a structured risk analysis has proven effective, because it makes those responsible aware of the operational risks. Risk areas can be easily defined and each assigned to a responsible person who plans and implements mitigating measures.
If the risks are categorized according to probability and potential damage, the defined measures can be prioritized quite easily according to cost and effect. Of course, the risk catalog must be reviewed periodically, following a defined process, and adapted to current conditions.
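To make the prioritization by probability, potential damage, cost and effect more tangible, here is a minimal sketch of a risk catalog in Python. It is an illustration under stated assumptions, not a prescribed method: the `Risk` class, its field names, the scoring formulas (exposure as probability times damage, benefit as reduced exposure per franc spent) and all figures in the example catalog are hypothetical and would need to be replaced by a company's own estimates.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """One entry in a (hypothetical) risk catalog."""
    name: str
    owner: str                   # person responsible for mitigating measures
    probability: float           # estimated likelihood per year, 0.0 to 1.0
    damage_chf: float            # estimated damage if the risk materializes
    mitigation_cost_chf: float   # estimated cost of the measure, must be > 0
    risk_reduction: float        # share of exposure the measure removes, 0.0 to 1.0

    @property
    def exposure_chf(self) -> float:
        # Assumption: exposure is modeled as probability times potential damage.
        return self.probability * self.damage_chf

    @property
    def benefit_per_franc(self) -> float:
        # Assumption: effect is the exposure removed per franc invested.
        return self.exposure_chf * self.risk_reduction / self.mitigation_cost_chf


def prioritize(risks):
    """Return the risks ordered by the value their mitigation creates per franc."""
    return sorted(risks, key=lambda r: r.benefit_per_franc, reverse=True)


# Illustrative figures only; none of these numbers come from a real analysis.
catalog = [
    Risk("Legacy ERP hardware failure", "Product manager ERP", 0.05, 300_000, 60_000, 0.9),
    Risk("Ransomware on file server", "Head of IT", 0.10, 500_000, 20_000, 0.8),
    Risk("Remote access gateway outage", "Network lead", 0.30, 80_000, 15_000, 0.5),
]

for risk in prioritize(catalog):
    print(f"{risk.name} ({risk.owner}): "
          f"exposure CHF {risk.exposure_chf:,.0f}, "
          f"benefit per franc {risk.benefit_per_franc:.2f}")
```

With these made-up inputs, the ransomware measure ranks first, because it removes the most exposure per franc invested. Re-running the same ranking with updated estimates is one simple way to support the periodic review of the risk catalog.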
Resilient IT operations are not just a management task but a company-wide issue. IT resilience is a process that has to be established, triggered periodically by management, and incorporated into the company's culture at all levels.