Podcast

Root Causes 426: Expired Certificate Takes Down Bank of England

Hosted by: Tim Callan, Chief Compliance Officer
Original broadcast date: September 30, 2024

A certificate expiration is now known to have caused July's outage at the Bank of England. Join us as we shake our heads in amazement yet again.

Podcast Transcript

Tim Callan: So Jason, quick one. Easy one. I am looking at an article in The Stack from September 30, 2024, and I'll just give you the headline. People can find it on their own. The headline reads, Expired Certificate Crashed $6 Trillion Bank of England System. So what happened here?
Jason Soroko: It sounds like certificate lifecycle management was not used in some major system, and a certificate expired, bringing down a very, very important system.
Tim Callan: So this was an outage in July, to be clear, and what's happened now is that the newly revealed information is the root cause, and it was just good old-fashioned certificate expiration. They all expire, right Jay?
Jason Soroko: They do. And in fact, we've done podcasts on the reasons why you want that to be the case. You want certificates to expire.
Tim Callan: They must expire. Otherwise, why have them? It's not the only reason, but it's the biggest reason to have certificates. And yes, so the consequence of that was that the $6 trillion Bank of England system went down. And so, as we talk about so often, you say, what was the cost to Bank of England and its ecosystem and the people who depend on it, and how many orders of magnitude greater is that cost than the cost of putting some simple automated certificate lifecycle management in place to solve this problem upfront?
Jason Soroko: I don't understand why people wouldn't buy the cheapest insurance possible, which is certificate lifecycle management. If you're going to be using certificates, every certificate should be a managed certificate. I'm going to keep repeating that until I don't have to anymore. I don't know when that's going to be.
Tim Callan: Insurance is the perfect analogy. It's like all of these people are driving without insurance, and if they don't get into an accident today, that's fine. But if they do get into an accident, then they're just up the creek. And we see this over and over and over again where these things would have been avoidable. And you and I are always just rattling off the big lists of these outages. And here's another one to put on the list. A major, major, major financial institution. Plenty of resourcing. Plenty of security savvy. Plenty of IT resources. Plenty of money to spend on getting things right, and yet, for some reason, they just didn't.
Jason Soroko: You're right. Bank of England, they easily could have put in some sort of an automated system to avoid this problem, but did not. It shocks me, given how well resourced and how technically smart they are. And I'll say, the Bank of England would employ a ton of staff who understand how to calculate risk, including technical risk.
Tim Callan: Yes. It's a major financial institution. Calculation of technical risk is a core competency.
Jason Soroko: Like they should be some of the smartest people on the planet for that. So, I'm almost without words. But Tim, it's just these kinds of examples of certificate outages. We report the major ones. There are probably certificate outages going on as we speak right now that probably won't even get reported on.
Tim Callan: That make it in the headlines. So, yes. So the other point to remember is, when it's Bank of England or Nintendo or Microsoft, then it makes it to the point where you and I talk about it. When it's Bank of - I always pick on Bank of Sacramento. I don't know why. When it's Bank of Sacramento, it doesn't make the headlines, and we don't talk about it. By the way, no offense to anybody at Bank of Sacramento. I'm sure you do a great job. We don't talk about it. Or some 500 person company somewhere. We don't talk about it at all. The other thing to remember is, we usually don't know what the root cause of an outage or breach is. It is very rare that we get enough information that we know that it involves an expired certificate that wasn't renewed. That is a rare occurrence. Most of the time there are these breaches, there are these outages, and nobody who's not an insider at that company actually understands why it occurred, which makes me believe, even if I can't prove it, that the number of actual high profile outages that owe themselves to certificate management problems, including unrenewed certificates, is considerably higher than what you and I see in the newspapers.
Jason Soroko: I don't know how simple we have to put this, but technical folks, people who are in the trenches, technical leaders, risk officers, if you have any form of digital system in your organization, you are probably running certificates somewhere, and every certificate needs to be a managed certificate. And what we are trying to show you by doing this podcast is that a lot of you still have unmanaged certificates that you're not keeping track of, and they are going to bring down your system at some point, and it's going to be bad.
Tim Callan: And it's interesting. So you and I, I think in general, we want to talk about something new and something different, because that's the point of the podcast, because all the old episodes are available. Like, I'm not going to record another episode where I define certificate agility, because we already have that episode. So we talk about this a lot, and we've talked about it a lot of times. And I think there's a danger of you and me thinking, well, that's been discussed. But it hasn't soaked in. Maybe it's been discussed, but it hasn't been internalized. And so in that regard, I think we're gonna remain vigilant about making sure that we bring this up in the podcast.
Jason Soroko: Root Causes Episode 117, back in November 2020 - -
Tim Callan: There's a blast from the past.
Jason Soroko: That is total certificate agility. Please go and listen to that. Even if you've heard it a couple times, go and listen to it again. Please. We would love to see the extinction of these kinds of podcasts, where we talk about gigantic outages in major systems, where there was no excuse.
Tim Callan: Where they were completely avoidable. That's the thing. Completely avoidable. An unrenewed certificate is about the most avoidable outage you're gonna find.
Jason Soroko: The technology to solve it exists. It's not 1995. The technology to solve this exists. And it's not even priced out of your price band. Go get it. It exists. This can be solved. You don't have to worry about it or think about it, just go and solve it.
Tim Callan: Yep. I agree, Jay. All right. Well, this has been Root Causes.
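
Editor's illustration: the point made throughout the episode, that every certificate should be tracked and renewed before it expires, can be sketched in a few lines. This is only a minimal sketch and not a description of any real certificate lifecycle management product or of the Bank of England's systems. The inventory, the hostnames, the dates, and the 30-day renewal window are all hypothetical assumptions; the only library call relied on is Python's standard `ssl.cert_time_to_seconds`, which parses a certificate's `notAfter` timestamp.

```python
import ssl
from datetime import datetime, timezone

# Assumed policy: flag anything expiring within 30 days (hypothetical threshold).
RENEWAL_WINDOW_DAYS = 30


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return days between `now` and a certificate's notAfter timestamp.

    `not_after` uses the format found in certificates,
    e.g. "Jul 18 12:00:00 2024 GMT". A negative result means already expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def certificates_needing_renewal(inventory, now, window_days=RENEWAL_WINDOW_DAYS):
    """Return the sorted names of certificates expiring within the window."""
    return sorted(
        name
        for name, not_after in inventory.items()
        if days_until_expiry(not_after, now) <= window_days
    )


# Hypothetical inventory: name -> notAfter string. Invented for illustration.
inventory = {
    "payments.example.test": "Jul 18 12:00:00 2024 GMT",
    "intranet.example.test": "Dec 31 23:59:59 2025 GMT",
}
now = datetime(2024, 7, 1, tzinfo=timezone.utc)
print(certificates_needing_renewal(inventory, now))
```

A check like this only helps if the inventory is complete, which is the hosts' point about unmanaged certificates: the certificate nobody is tracking is the one that causes the outage.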

