Q&A: Dynatrace's Martin Bradbury on the growing risk of banking IT outages

With banking outages becoming more frequent and disruptive, financial institutions must rethink their approach to resilience before trust erodes further.

“I don’t think compensation ultimately cuts it. Compensation might go some way to offset the inconvenience of an issue, but fundamentally, having the issue in the first place is the problem,” says Martin Bradbury, UK&I Regional Director, Financial Services at Dynatrace.

With UK banks experiencing over 800 hours of IT outages in just two years, it’s clear that resilience remains a serious concern. Yet despite looming regulatory deadlines, large institutions are still struggling to prevent major disruptions.

Why? 

A growing crisis: Why IT failures are escalating 

The Treasury Committee’s inquiry into UK banking outages comes after several high-profile failures, most notably the Barclays disruption earlier this year that lasted over three days.

“We’ve had several high-profile incidents that have occurred in the first quarter of this year already. The Barclays incident, for example, which was the real catalyst behind this intervention, ran for over three days. The timing is very specifically related to the concentration we’ve seen in the past few months,” explains Bradbury.  

These failures expose systemic weaknesses in the sector.

“If you read the responses back to the Treasury Committee from the top nine banks and building societies, there are some common threads. Change-related disruptions, where banks update their own applications and infrastructure, are a major contributor. Reliance on third parties is another prevalent cause—failures in services banks depend on, but do not directly control,” Bradbury notes.

“Ultimately, complexity is the main contributing factor. Most of these organizations have a huge amount of technology platform complexity, with cloud, legacy systems, and digital services all operating together. Quite often, these banks are organized around individual parts of the technology stack, meaning they don’t always have great visibility of the end-to-end customer experience.” 

The real-world impact of outages 

While regulators focus on system failures from a compliance perspective, the human and business consequences are often underestimated.

“I think banks are actually very aware of the impact these outages have. There is a huge amount of focus on identifying quickly how many customers are impacted, what the significance of the impact is. But you can’t play down the level of disruption this causes to daily life.

“People can’t buy a train ticket, put fuel in the car, pay their bills, or get on a bus. And from a business perspective, it’s equally damaging—missed salary payments, invoice delays, cash flow problems,” Bradbury explains.

“It damages trust—not just between banks and their customers, but in the wider financial system. People expect these services to work like a utility; when they don’t, it’s real, and it’s stressful.” 

Can banks compensate for failure? 

Banks’ responses to outages vary widely. Barclays has allocated £12.5 million in compensation, while HSBC has reimbursed just over £230,000, and AIB only £590.

“The inconsistency raises questions, but the bigger issue is how banks respond to problems in real time. Having a clear set of communication when there’s an issue, being proactive in reaching out to customers, and giving as much clarity as possible on resolution timelines—these things matter.

“Most people are pretty reasonable. If they know there’s an issue and can plan around it, they will. But the unknown is often the real problem,” Bradbury says.

“That’s why transparency and timely communication should be a bigger focus than compensation figures alone.” 

Balancing digital innovation and risk 

As banks push forward with AI-driven services and cloud adoption, they must also contend with aging infrastructure and complex integrations. Bradbury stresses that success depends on having a clear, end-to-end view of the entire IT ecosystem.

“It all comes down to having clear end-to-end visibility of that ecosystem. You won’t necessarily prevent outages, but you will be able to see them early and identify the root cause faster. That’s where the investment needs to be,” he explains.

“Ecosystems are not getting simpler—they’re getting more complex. The difference between best-in-class organizations and those with work to do is the level of investment they’ve made in observability, sophistication, and looking at that full digital ecosystem.” 

But digital transformation comes with trade-offs.

“We talk about digital innovation, but what we really mean is speed to market—bringing new features to customers quickly. Change is a big instigator of outages, so banks need strong processes, rigorous testing, and real-time monitoring to balance speed with reliability.

“Everyone wants cool new features, but making sure an organization has the right controls to release them safely is just as important,” Bradbury says.

“The challenge is finding the middle ground between agility and the process rigor banks have historically relied on.” 

What banks must do now 

For financial institutions that have suffered multiple high-profile failures, Bradbury has clear advice:

“First, get full visibility into your IT environment. You can’t fix what you can’t see. Enhancing observability and real-time monitoring is the foundation for everything else.

“Second, shift to a customer-centric view of resilience. Instead of just tracking internal system health, banks need to monitor the actual digital journeys customers take—payments, loan applications, mortgage approvals. If you can detect degradation in real time, you can prevent an outage before it escalates.” 

The future of IT resilience in banking 

Are we moving towards a world where major banking outages are rare?

Bradbury is pragmatic: “I don’t think it’s ever going to be realistic to say an organization will have zero technical problems in their digital environment. It’s unlikely. It’s all about the speed with which they can respond to them. If you look at some of our customers, we’ve seen really significant incident reductions.

“The Bank of New Zealand had a 94% reduction in major incidents off the back of their work with Dynatrace. So, while complexity will continue to rise, we should also see increased sophistication in the way organizations observe and manage their environments. That, in turn, should lead to a significant reduction in the overall volume of incidents.” 

Banks can no longer afford to treat IT failures as an inevitability. Strengthening resilience is about maintaining trust in an industry where reliability is non-negotiable. 
