Podcast Apr 15, 2024

Root Causes 378: Why Are Forced Revocations So Difficult?

In the latest in our ongoing series of discussions of the Bugzilla Bloodbath, we delve deep into the problem of failure to revoke on time and the multiple causes that lead to this ongoing failure. And what to do about them.

  • Original Broadcast Date: April 15, 2024

Episode Transcript

Lightly edited for flow and brevity.

  • Tim Callan

    Okay, so we're following up. We've had a series of episodes we've done on the craziness that's going on on Bugzilla right now and if you haven't listened to them, the ones to go look at are one called Drama on Bugzilla, one called The Bugzilla Bloodbath, and one we very recently recorded called Does CPS and Issuance Misalignment Constitute a Revocation Event? And those are all episodes that are good to focus on, but I think what we wanted to talk about today as part of all of this craziness on Bugzilla is the incredible number and degree of incidents we have right now. Active incidents on Bugzilla where the gist of it is that the CA did not revoke the certificates on time.

  • Jason Soroko

    Right. Misissuance. Right. And I think in very recent podcasts that you just recorded, you made a really good case for are the rules that we have, are they good rules? Should we be following these rules? And we're talking about the CPS specifically. Look, I think there are multiple potential reasons for misissuance.

  • Tim Callan

    Yeah. There are many.

  • Jason Soroko

    I think you could probably, Tim, off your top of your head, you could probably rhyme off three or four. No problem. Right?

  • Tim Callan

    I could probably rhyme off a dozen no problem. But go on.

  • Jason Soroko

    There you go. There you go.

  • Tim Callan

    Yeah.

  • Jason Soroko

    I'm gonna go all the way back to Root Causes Episode 128, What is Total Certificate Agility?

  • Tim Callan

    Okay.

  • Jason Soroko

    And that was a term that you coined, Tim, back in November 2020, and what it makes me think about is this topic of misissuance. We’ve now had three podcasts in this flurry of Bugzilla activity.

  • Tim Callan

    Yeah.

  • Jason Soroko

    That's a lot of podcasts to cover what's been going on. And that should give you some indication of just how much unprecedented activity there is, as you just stated. So what I'd like to argue is this. Let's find the pattern here, Tim and simplify this for the listener, which is this. Total certificate agility is a little different than crypto agility, right? Crypto agility is all about, hey, what happens if RSA gets deprecated? Or what happens in the quantum apocalypse? Oh, my God. You need crypto agility. Right? That's a whole other topic. Certificate agility covers all the other reasons why you might need to swap your certs quickly and misissuance, to me, is a major fundamental reason why you want total certificate agility and that is because – and I'm gonna say it, and I think you've said this before - misissuance is inevitable amongst the CAs. Regardless of your CA, it’s inevitable, Tim.

  • Tim Callan

    Yeah. There is a certain amount of it that just happens. It's just the rules are very precise, and very complex and very detailed, and very large in scope. And so there isn't a sizable CA on the planet who in the course of a year doesn't wind up declaring some kind of misissuance event. It's just not realistic. It would be tantamount to saying, well, why don't you ship software without bugs? And you just can't - - like, we'd all like that but, in the real world, software is sufficiently complicated, that you can't really ship software without bugs. It's the same thing. You can't really operate a public CA at any kind of scale without a certain small amount of background noise misissuance occurring.

  • Jason Soroko

    Tim, I think in Episode 373, which wasn't that long ago, we were talking about the - I think the title of that was the Bugzilla Bloodbath - and part of what was being discussed there was this incongruity between the EV rules and the BRs.

  • Tim Callan

    Yeah.

  • Jason Soroko

    So like, even as complex as these things are, due to their complexity, there are even incongruencies that can lead to hey, you misissued against a certain set of rules, but not the other set of rules but it is still a misissuance.

  • Tim Callan

    Right. It’s hard. It's hard. So to your point, Jason, this is a thing that can occur and it's just part of the landscape, but by the way, it wouldn't have to be misissuance. There's other reasons why sometimes certificates need to go away right now.

  • Jason Soroko

    Exactly.

  • Tim Callan

    And a great example is private key compromise. If your private key is compromised, your certificate is not secure. Someone else has the key to the lock.

  • Jason Soroko

    Root Causes 128 my friends. Total Certificate Agility. Tim covers a pile of these. All these things, which, hey you might say to - - like, here is my main point of harping on misissuance, Tim. It's that I'm declaring it inevitable for every single procurer of publicly trusted certificates. You are going to deal with an email from your CA at some point that says, hey, we need to revoke X, Y, and Z certificates and sometimes that number of certificates is gonna be a large number.

  • Tim Callan

    Yeah. And so, so let's get to - - let's touch on something that's very important is right now according to - as of today's recording, it might be different when this episode goes up - but as of today's recording, I count 15 open bugs in Bugzilla about failure to revoke certificates on time. Jason, this is completely unprecedented. Most of the time, there are zero.

  • Jason Soroko

    On top of that, Tim - - I don't want to break your flow, but I'm just gonna add a point. Some of the CAs and it's not about naming names, some of the CAs are fighting revoking the misissued certificates. It's not the issuance that anybody has a problem with. It's the reaction of saying, this is too painful to me, it is too painful to my customers, and this is what this podcast is about. We are in 2024.

  • Tim Callan

    Not some. All. Like all of those 15 - The reason that there is a bug is because they didn't get their certificates revoked in time. I don't believe in any one of those cases, it's because we did not have the technical capability to revoke these certs. I'm pretty sure that every single one of them was, we determined that it would be too difficult, inconvenient for our subscribers, for us to revoke these certificates within the time period.

  • Jason Soroko

    Geez.

  • Tim Callan

    This is not okay. This is not okay. Like and there's two ways you can look at it. You say, look, either the subscriber just flat out can't replace the certificate within that time period and if you revoke the certificate on time, it's going to go dark, or the subscriber can replace the certificate within that time period. Now let's explore those scenarios.

    Both of them. Let’s say we say that the subscriber can't replace the certificate within that time period. Okay. That's really, really, really bad. Because if it is a Heartbleed kind of situation, or a compromised private key kind of situation, you're just plain exposed. Your security is zero. And if you're incapable of making your security not be zero under those circumstances, then you're just not secure. Pure and simple. Lousy answer. Awful answer.

    Now, let's suppose on the other hand, that the subscriber can get it done within the time period, but they just don't want to. It's inconvenient. I want to do something else. It's expensive. It makes me grouchy. I have to tell my boss.

  • Jason Soroko

    It's Christmas.

  • Tim Callan

    It's Christmas. That is definitely not an adequate excuse for a public CA not to do the thing that it's supposed to do. And if the public CA is going to accept that, then what's happening is the public CA is putting its own subscribers ahead of the larger public. And the only reason we're public CAs is because we made a promise to the larger public. Look at our earlier episode about CPS misalignment. We made a promise to the larger public, and we have to follow that promise. So for CAs to just declare, oh well, you know, it's hard. It's difficult. I understand. It’s not the CA's role in the ecosystem, right? There can be a person who works for the CA and their role is to console the customer and say, oh, I’m so sorry. Those meanies over in compliance. But somewhere along the line in the CA where the buck stops, they need to do the revocations on time. And, you know, I was asked a question very directly at last October’s face-to-face, the October 2023 one, by one of the browsers: what are the circumstances under which you would allow a delayed revocation? And we have a policy at Sectigo. It is if we determine that there is risk to human health and safety. If there is risk to human health and safety, we will allow a delayed revocation. No other reason. And so when you look at this large number of CAs that have got these weak, mealy-mouthed excuses, oh, these guys, they can't get it done. These are important functions that people need. They're things like banks and government agencies. No. That is not sufficient. That is not an okay reason to delay revocation. That is a CA not following the promise it made when it was given the privilege of being one of the stewards of public trust, and it is a privilege, and these are CAs who are not honoring their part of the bargain.

  • Jason Soroko

    Listeners to this podcast, I'd like to speak on behalf of all of you and say - -

  • Tim Callan

    I'm getting fired up again.

  • Jason Soroko

    Have you ever heard? Do you hear Tim? That's legitimate passion for doing the right thing. That's what you're hearing there. Okay? And it's because what we do is really serious. In fact, let me break the fourth wall here for a moment and say, we recently recorded a podcast with Bruno Couillard who has been around even longer than us and is just one of my, you know, one of the visionaries that I look up to. And I would say, he said something during one of those podcasts about it affects society in total. It's no less than that. Some of the fundamentals of how commerce works in our modern world depends on these things being done in the most forthright way possible and this is why we - - not only we are fired up, but many people in this industry get fired up about this subject. Tim, since you're in this mode, I want to give you the first crack. I'm going to offer that there are three solution categories for this.

    I think if the world puts in place three things, and I think you know what those are, then we will no longer be in this situation where we've got to jump on a podcast and get as fired up about this, because it will become much more of a non-issue because the pain of mass revocation will mostly go away, Tim.

  • Tim Callan

    Okay.

  • Jason Soroko

    I'm going to drop it on you. Give me what those categories are.

  • Tim Callan

    So one of them is - and this might be overly broad so tell me if you need me to be more specific. One of them is automation of certificate installation.

  • Jason Soroko

    100%, Tim. We've got to, got to, got to have automated certificate renewals – Number one.

  • Tim Callan

    Right. Number two is comprehensive discovery of certificates.

  • Jason Soroko

    100%. You can't manage what you don't know you have.

  • Tim Callan

    Right.

  • Jason Soroko

    And therefore, right, and that leads me to thinking about managed certificates in general.

  • Tim Callan

    Mm-Hmm, yeah. And so is that number three, Jay? I was gonna ask, what's number three?

  • Jason Soroko

    It can be number three.

  • Tim Callan

    What’s number three?

  • Jason Soroko

    Let’s say it is Certificate Lifecycle Management. It is just blanket, the five pillars of CLM. We're calling out number two as being like fundamental visibility. Discovery, we can call out separately. Why not?

  • Tim Callan

    Yeah. Yeah.

  • Jason Soroko

    I'm going to offer a fourth, Tim.

  • Tim Callan

    Okay. What's that?

  • Jason Soroko

    That is shortened certificate lifespans go a long way to dealing with this issue, Tim. You can explain why.

  • Tim Callan

    Well, sure. So, for starters, you know, the stakes get lower, right? If certificates are around for less time, the overall risk is less. The problem around things like misissuance is less. This whole CPS misalignment thing that we talked about in an earlier episode is less. The other consequences of misissuance are less. And by the way, the consequences of these problems and vulnerabilities are less. If a private key is stolen, there's less time for somebody to exploit it. So all of this, all of the bad things get reduced as soon as certificate lifespans get shorter. And then the need for revocation gets reduced, right?

    Go with an extreme thought experiment with me, Jay. Let's pretend that all public TLS certificates were no more than 10 days in duration, and I had a five day revocation event. That means that 50% of my certs are going to expire before the revocation time comes. I don't even need to do those.

  • Jason Soroko

    Exactly.

  • Tim Callan

    Now, let's make the thought experiment even more extreme. Let's say that all the certificates were five days in duration. I would never have to do a five-day revocation event at all. I would just fix my practice and go forward. Right?
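
The arithmetic behind this thought experiment can be sketched in a few lines of Python. This is only an illustration, not anything the speakers presented: it assumes certificates are issued at a steady rate, so the remaining lifetimes of the affected certificates are spread evenly between zero and the full lifetime. The function name `fraction_expiring_before_deadline` is hypothetical.

```python
def fraction_expiring_before_deadline(lifetime_days: float, deadline_days: float) -> float:
    """Share of affected certificates that expire on their own before the revocation deadline.

    Assumption (not from the episode): issuance is steady, so remaining
    lifetimes are uniformly distributed between 0 and lifetime_days.
    """
    if lifetime_days <= 0 or deadline_days < 0:
        raise ValueError("lifetime must be positive and deadline non-negative")
    # Under a uniform spread, the share with no more than deadline_days remaining
    # is deadline / lifetime, capped at 100%.
    return min(1.0, deadline_days / lifetime_days)


if __name__ == "__main__":
    # Lifetimes mentioned in the conversation, against a five-day revocation deadline.
    for lifetime in (90, 10, 5):
        share = fraction_expiring_before_deadline(lifetime, 5)
        print(f"{lifetime}-day certificates: {share:.1%} expire before a 5-day deadline")
```

Under that assumption, 10-day certificates give one half and 5-day certificates give everything, matching the two cases in the conversation, while 90-day certificates still leave the large majority to be revoked and replaced.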

  • Jason Soroko

    Tim, if you can do five-day certs, ten-day certs, you can do total certificate agility. Period.

  • Tim Callan

    Absolutely. Or even 90-day certs, right? 90 days is the next step that's coming and when we get to 90 days - - So then the last reason why shorter certificate lifespans are good, which you bring up, is because one of the things we've seen is that IT departments out in the world at enterprises, for whatever reason, do not get the air cover they need - the budget and resourcing and roadmap they need to automate their certificates. And I think the reason they don't get that is because they can limp along without it. Shortened certificate lifespans will mean that they can't limp along without it, and what that means is that the people with the purse strings, and the people with the roadmap decisions, and the people in the board rooms will finally give them the green light to do what I know a lot of them want to do. So it becomes a forcing function for IT teams to get these things prioritized and get them on the roadmap and delivered.

  • Jason Soroko

    Exactly. Tim, isn't it incredible that we live in a world where you and I are having legitimate podcasts about the future of quantum computing and artificial intelligence and yet people are fighting - - there is a faction of people literally fighting against the basics of Certificate Lifecycle Management?

  • Tim Callan

    Yeah, absolutely. And such a - - like, it's just straightforward automation, right? It doesn't require any advanced science. It doesn't require anything that we haven't known about computers for a long, long time. It's just a matter of doing the work.

  • Jason Soroko

    Being able to swap your certs. That means automating the renewals. That means having visibility to all the certs you need to. Discovery. Number three, the other four pillars of CLM and then shorten certificate lifespans. Folks, this is how to end this conversation about the pain of misissuance and mass revocation.

  • Tim Callan

    And if we bring it back to the point I was making earlier, CAs that break the rules to prevent revocation occurring in a reasonable timeframe don't help. They make it worse because they take away the real pain of not doing this right. Right? They go out of their way to prevent the consequences of, you know, somebody's ill preparedness from actually hurting that somebody and what is the upshot of that? Continued ill preparedness.

    So a CA that bites the bullet and says, I'm really sorry. I know this is gonna be unpleasant, but we all have to get here eventually is doing the public good. More good. It’s doing the web PKI more good than a CA that enables the continued lack of agility, lack of certificate agility in its subscribers.

  • Jason Soroko

    Hey, Tim. Guess what?

  • Tim Callan

    What?

  • Jason Soroko

    There's this fine print inside of Google's 90-day announcement in their blog about the shortening of certificate lifespans down to 90 days where they talk about CAs - - in order to be a trusted CA in the Chrome root program, you will have to have some form of automation offered by your CA; otherwise, you can't be a CA.

  • Tim Callan

    As of January 1, you are not allowed into the Chromium root program as a new CA if you don't support automation capabilities. Like they know this is the future and they're unabashed about it, and they're also unabashed that one of the reasons they want to shorten certificate lifespans is exactly as a forcing function for automation, because automation just makes everything better.

  • Jason Soroko

    It does. And I think in this podcast, Tim, we've spelled out the three or four top things that you just got to go and do. Hey, let's make 2024 - - by the end of this year, let's get most of the world's publicly trusted certificates managed. In other words, discovered and fully managed and automated, and then we can start to live in a world where we can get certificate lifespans down really short, and we no longer have to have these conversations.

  • Tim Callan

    I completely agree. Let's do that.

  • Jason Soroko

    Right on, Tim.

  • Tim Callan

    All right. Thank you, Jason.

  • Jason Soroko

    Thank you.

  • Tim Callan

    This has been Root Causes.