One thing is for sure when it comes to enterprise IT: disruptions are inevitable. From cyberattacks and natural disasters to supply chain failures and technological outages, and even just plain old bugs, threats to business continuity loom large.
The question is no longer if disruption will occur, but when – and how prepared organisations are to weather the storm. With mission-critical systems forming the backbone of modern enterprises, downtime isn’t merely inconvenient – it’s potentially catastrophic to both operations and reputation.
As a result, operational resilience is the most crucial thing of all, said Paul Flavin, managing director of enterprise IT solutions provider Triangle.
Operational resilience is the ability of an organisation to maintain its critical operations through disruptions, allowing it to withstand, adapt, and recover from adverse events. In other words, to keep the business running at all times.
“Really, that’s the core of our business. What it all boils down to is keeping an environment, the applications and everything it supports, up,” he said.
Triangle does not manage applications itself. Its clients run some of the best-known enterprise applications, and Triangle works to ensure those applications can run without problems.
“We then manage those environments. If you think of everything outside the application itself, whether it’s Oracle or SAP, we work on that. We keep it as bulletproof as possible,” he said.
In part, this is because the hardware, network and operating system layers tend to be more problematic than the actual applications themselves.
“When you see problems, it’s seldom down to the applications; and our view is that the best thing you can do is ensure you have maximum operational resilience, whether that’s all on-premise, in private cloud or going to the hyperscale public cloud,” he said.
Indeed, where to build IT infrastructure is increasingly seen as a question of resilience. While public cloud is easy to adopt, there is a growing trend toward private cloud.
“We’re seeing more repatriation. Some of that is for regulatory reasons, in things like insurance, but a lot of it is also about cost control and, yes, resilience,” he said.
Given that even brief outages can result in significant financial losses, damaged customer relationships, and regulatory complications, resilience has shifted from being an IT concern to a boardroom priority, with executives increasingly recognising its strategic importance.
Naturally, larger enterprises, which are often able to invest and may also be subject to stricter compliance, tend to lead the way.
“Enterprise doesn’t necessarily mean multi-billion turnover companies with 10,000 employees. It can be things like clinical science; what it means is people who are invested in technology,” he said.
Of course, the EU’s Digital Operational Resilience Act (DORA) means that businesses in financial services are legally obliged to take resilience seriously, but Flavin said all organisations should be thinking about it.
“You’ve got to be well managed and you have to be prepared. You have got to be operationally resilient at all times, even if you can recover down the road,” he said.
When the day comes
The reality is, though, that the growing cyber security threat alone poses a challenge to any organisation that needs to stay up and running. As a result, Flavin said, recovery is about a lot more than backups.
“Our position is you can be as secure as you want, and you absolutely should invest in security, but you’re going to get hit,” he said.
“People get attacked, perhaps they aren’t breached, but the defences are attacked. Anyone that has a reasonable business, whether they have to be compliant or not, they want to get their business back,” he said.
Triangle, while agreeing that organisations have to spend on security, has a slightly different proposition, asking ‘what happens if?’
The goal is to minimise downtime as much as possible, even in the face of an attack.
“What will you say to the board? Do you have a clean room environment, with clean data, that you can get up and running by 9 am tomorrow?”
To do this, a feed is taken from production data, scanned by artificial intelligence (AI) to ensure it is free of threats, and then stored in an air-gapped vault.
However, this vault is not simply a repository for copies of data.
“If the AI detects a problem, it stops that and alerts you, so your process can be put to work. You then run your clean room as your production environment until such time as the production environment is found to be clean.”
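As a rough illustration of the flow Flavin describes (scan the feed, store only clean data, stop and alert on a problem), the logic might be sketched as follows. This is not Triangle's implementation: the `Vault` class, the `ingest` function and the `toy_scan` check are all hypothetical names, and the scan is a trivial placeholder standing in for the AI analysis.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Vault:
    """Hypothetical stand-in for the air-gapped vault: holds only scanned snapshots."""
    snapshots: List[bytes] = field(default_factory=list)
    halted: bool = False
    alerts: List[str] = field(default_factory=list)


def ingest(vault: Vault, snapshot: bytes, is_clean: Callable[[bytes], bool]) -> bool:
    """Scan one snapshot from the production feed; store it only if it is clean.

    Returns True if the snapshot was stored. If the scan fails, the feed is
    halted and an alert is recorded so the recovery process can start.
    """
    if vault.halted:
        # Nothing more is accepted once a problem has been detected.
        return False
    if not is_clean(snapshot):
        vault.halted = True
        vault.alerts.append("scan failed: feed halted, switch to clean room")
        return False
    vault.snapshots.append(snapshot)
    return True


def toy_scan(snapshot: bytes) -> bool:
    # Placeholder for the AI scan (assumption): flags a made-up marker string.
    return b"MALWARE" not in snapshot
```

In this sketch, a clean snapshot is stored, a flagged one halts the feed and raises an alert, and everything stored before the halt remains available as the basis for the clean room environment.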
This approach is more valuable than backups, as it is a fully functional production environment, Flavin said.
“It’s a minimal viable company, but it means you’re not going to be sitting on your hands for a week. It’s not a static record of your data.”