Root Causes 448: The Privilege of Being a Public CA

Hosted by

Tim Callan

Chief Compliance Officer

Jason Soroko

Fellow

Original broadcast date

December 17, 2024

We go over Tim's September 2024 keynote speech at ENISA CA Day, "The Privilege of Being a Public CA."

Podcast Transcript

Lightly edited for flow and brevity.

Tim Callan: Jay, what we're going to talk about today, we're actually going to go back to September of 2024, and talk about something that we've been meaning to talk about, but there have been so many kind of urgent, pressing news items that it keeps getting delayed and delayed and delayed but I'm glad that we're finally doing it. And what that is, is, every year ETSI, as part of the ETSI conference, holds what they call CA Day. And CA Day is a day of content that's focused on and for public CAs and TSPs, trust service providers, in the ETSI universe. And I was very honored to be invited to be one of the keynotes at ETSI CA Day and at the time when the invitation came in, I was asked, if we wanted you to do this, what would you talk about? And what I said was, I don't remember my exact wording of it, but the gist of it was, I want to talk about the privilege of being a public CA, and so the presentation that I gave there is called The Privilege of Being a Public CA. And what we were thinking is we would just go through the content of that presentation here today.

And this, of course, had slides, so you're not going to get to see the slides, but I'll try to articulate this in a way that it comes across fine, and it’ll make sense. So with that, the privilege of being a public CA. I started actually talking about delayed revocation. So let's start there. So let's talk about delayed revocation. Jason.

And there's a word we use in the web PKI community. It's called delrev – d-e-l-r-e-v. That just means a delayed revocation bug. A Bugzilla bug that is opened when a mandatory revocation doesn't get done on time, and we've done a number of episodes on revocation. So if you want more details on that, go back and listen to those. And I want to focus in on what I call deliberate delayed revocation and that's kind of my coinage, but I think it's pretty obvious what that is. That's a delrev where the CA has the technical and procedural ability to do the revocation and is aware that the revocation must be done, and yet, by choice, doesn't do the revocation. I want to distinguish deliberate delrev from accidental delrev.

The accidental delrev would be where the delay in revocation occurs unwittingly. Maybe there's a software error, or kind of an undetected procedural error, or possibly the CA doesn't realize that there's a bug and doesn't get the revocation done because somehow they don't know that that's supposed to happen. And so, accidental delrevs are their own problem, but they're a different problem from deliberate delrevs because an accidental delrev, if anything, is a competence problem. A deliberate delrev is a choice. It is a decision making problem, if you will. So the snapshot in September 2024 when I took this, as of 9/20/24, which is when I sat and did this research, not too long before the presentation occurred, there were 70 open bugs in Bugzilla. Of those, 22 were delayed revocation bugs and of those, all but two were instances of a deliberate delrev. So 29% of the open Bugzilla bugs, nearly 1/3, were deliberate delayed revocation. Of the other two, one of them was an accidental delrev, the kind I talked about before. In that case, it was a procedural error, not a software error, and then there was one meta bug, which was just kind of gathering together all the delayed revocations, because there was so much going on in that space. So 20 deliberate delrevs, which is pretty incredible, if you think about it because - -

Jason Soroko: That's a lot.
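Editor's note: the arithmetic behind the snapshot above can be checked in a few lines. This is only a sketch using the counts quoted in the episode; the variable and function names are made up for illustration and nothing here queries Bugzilla.

```python
# Counts as quoted in the episode for the 9/20/24 snapshot.
open_bugs = 70          # all open CA bugs at that date
delrev_bugs = 22        # delayed revocation bugs among them
accidental_delrevs = 1  # the procedural-error case
meta_bugs = 1           # the tracking bug that groups the delrevs

# Everything that is neither accidental nor the meta bug is deliberate.
deliberate_delrevs = delrev_bugs - accidental_delrevs - meta_bugs


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(deliberate_delrevs)                    # 20
print(share(deliberate_delrevs, open_bugs))  # 28.6, quoted as 29%
print(share(delrev_bugs, open_bugs))         # 31.4, all delrev bugs
```

So the 29% figure refers to the 20 deliberate cases, and "nearly 1/3" fits either the deliberate cases or all 22 delayed revocation bugs.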

Tim Callan: It’s more than any other source of bugs in Bugzilla right now. There's no other single thing that is creating more bugs than deliberate delayed revocation is and again, what's crazy about that in particular is this isn't error. This is choice. And this is troubling to me, and it got to the point where, very shortly before I gave that presentation, a proposal was advanced by Mozilla that would put some really stiff penalties on both the CAs that gave deliberate delayed revocation and the subscribers that received it. And as such, that was a proposal that was directly in response to the problem that we were seeing. You and I have discussed that whole proposal in detail in a previous episode, so I'm not going to go into that here. If you want more information on the Mozilla proposal, go back and listen to that episode.

And the point here, though, is that this wasn't a speech, and this today isn't an episode about delayed revocation. Rather, it's an episode about doing the right thing. And so delrev is just an example.

So in addition to delrev, there's other things. There's refrev, which is refused revocation, which is where the CA just decides not to do the revocation. There's failure to lint. There's failure to report problems when they do occur. And then another one that's rampant is failure to answer questions, correct previous errors or learn from other CAs.

I get questions about my behavior and why the bug occurred, and I just don't answer them, or don't answer them on time, or give vague, evasive answers. Or failure to correct previous errors. People have an error. They know it's an error. They're aware of the fact, and they don't go and do the work to make it not happen again. They knowingly just don't do it. Or a big one is failure to learn from other CAs. CA A has a problem, and then a year later, CA B has the exact, precise same problem, even though the whole community already watched CA A walk through this, have a root cause analysis, identify action items and apply those action items. And yet, other CAs didn't know of it, didn't act on it, didn't apply the same learnings and didn't prevent themselves from having the same problem.

So the question I posed is, is this what we want? Is this what we want for our community? Is this what we want for our industry? Is this what we want for the group of organizations that are supposed to be the gatekeepers of identity and trust for the entire world? And if you think about it, it is an incredibly rare thing that has been given. I don't even know the answer to this, Jason. So you and I are both guessing. But how many technology companies do you think are in the world?

Jason Soroko: It's almost impossible to answer because, I mean, there's probably so many below the waterline that are still in stealth mode that are going to be very important in the next five years. Who knows?

Tim Callan: Do you know how many public CAs there are? Maybe 50. So what percentage of the tech companies in the world have been given this amazing privilege where they are allowed to vouch for digital identity of any digital entity anywhere in the world? And that isn't just SSL certificates. It’s S/MIME certificates, it’s code signing, it's document signing. Like what percentage of tech companies are given that privilege? It's vanishingly small. So to see the percentage of people who do have it treat it like it's nothing, or like it's their right, or like it's something to be squandered and ignored and poo-pooed is simply incredible to me.

Jason Soroko: It is true, Tim, and dare I say it, because this podcast isn't about selling, but not all CAs are created equal, that's for sure. And I think we can just say that.

Tim Callan: Then on my next slide I say, is this what we want? CAs who act like their corporate owners trump the WebPKI. CAs who act like their local governments trump the WebPKI. CAs who do the minimum they can get away with. CAs who choose subscriber convenience over compliance. CAs who hide and misdirect and trick the community. CAs who don't know the basic rules for being a CA. CAs who make self-serving, intellectually dishonest arguments in online forums. And CAs who resist changes in CA/Browser Forum that are for the good of the community, for the sake of their own convenience. And with all of these things, not only have we seen every one of these in the last year, but we have seen every one of these multiple times. Some of them many times in the last year, and these are only the things we know about. Let me tell you some of the rules we have.

These are rules we have that are hard rules in the CA/Browser Forum. There is a rule requiring that CAs do pre-certificate linting, pre-issuance linting. That's crazy. Linting is such a wonderful best practice. It is a silver bullet that just knocks out so many sources of misissuance. It's so powerful, and free linters are available that people put in open source, that they do just out of the goodness of their hearts. Why on earth would any CA not do pre-issuance linting? Why on earth does there need to be a rule that forces them to do it? There is a rule that requires that CAs support automation. Automation, Jay. Are you a little bit of a fan of automation maybe?

Jason Soroko: I try to talk about it in every episode.

Tim Callan: Do you want to take 20 seconds and tell us the many reasons that automation is good?

Jason Soroko: Every certificate should be a managed certificate. This is fundamentally important going forward in how we're going to innovate everything that you want to have happen in your digital life. You just cannot be living in the era where you're in front of a spreadsheet managing your certificates manually anymore. It doesn't work.

Tim Callan: Absolutely right. 100% correct. So why in the world do we need a rule that requires CAs to support automation? There is a rule dictating the maximum time in which you're allowed to respond to a certificate report. There is a rule dictating the maximum time in which you're allowed to respond to a question on Bugzilla. By the way, both of these rules are routinely not followed. I watch the Bugzilla ones not being followed all the time. There's a rule requiring that CAs find root causes for their incidents and create action items to rectify these root causes. Why do we need that rule? Isn't that called software development? Isn't that called bug fixing? In what crazy, upside down bizarro world do we need any of these rules? And yet we have them all.
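Editor's note: for readers who have not seen pre-issuance linting, the idea mentioned above can be pictured as a gate that runs checks on a certificate before it is signed and refuses to sign if any check fails. The sketch below is a toy illustration only. The `CertRequest` type, the three checks and the two limits are assumptions chosen for the example, and real open-source linters inspect the encoded to-be-signed certificate against a much larger rule set.

```python
from dataclasses import dataclass, field

# Illustrative limits for this sketch only (assumptions, not a rule set).
MAX_VALIDITY_DAYS = 398
MIN_RSA_BITS = 2048


@dataclass
class CertRequest:
    """Hypothetical stand-in for a to-be-signed certificate."""
    dns_names: list = field(default_factory=list)
    validity_days: int = 0
    rsa_bits: int = 0


def lint(req):
    """Return a list of findings; an empty list means the request is clean."""
    findings = []
    if not req.dns_names:
        findings.append("no DNS names in subjectAltName")
    if req.validity_days > MAX_VALIDITY_DAYS:
        findings.append(f"validity {req.validity_days}d exceeds {MAX_VALIDITY_DAYS}d")
    if req.rsa_bits < MIN_RSA_BITS:
        findings.append(f"RSA key {req.rsa_bits} bits below {MIN_RSA_BITS}")
    return findings


def issue(req, sign):
    """Sign only if linting finds nothing; otherwise refuse to issue."""
    findings = lint(req)
    if findings:
        raise ValueError("refusing to issue: " + "; ".join(findings))
    return sign(req)


if __name__ == "__main__":
    ok = CertRequest(dns_names=["example.com"], validity_days=90, rsa_bits=2048)
    print(lint(ok))  # no findings, so issue() would go on to sign
    bad = CertRequest(dns_names=[], validity_days=400, rsa_bits=1024)
    print(lint(bad))  # three findings, so issue() would raise
```

The point of the gate is ordering: the checks run before the signature exists, so a mistake becomes a rejected request rather than a misissued certificate that later has to be revoked.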

Jason Soroko: I tell you, Tim, I don't know if you've got more to go, but I got a comment here. Some of these things really come out of the Wild West history of the internet, and it has taken years and years to expunge that kind of thinking out of what is really the basis of trust on the internet.

Tim Callan: And so, by the way, that was my second to last slide. My last slide is just words in the middle of the slide that say, what do we want to be?

And that was what I just wanted to leave that group with. There were a lot of people. There were, I think, 300 people in the room. There were 1000 people online. It was the biggest audience I was ever going to have. And I said, for this audience, what is the one thing I want to say to this audience? And that's what I said to them. Come on. Come on people. Have some pride, have some dignity, have some self-awareness. Recognize that you have been granted a privilege that almost nobody on the planet has been granted, and at least try a little bit to live up to it.

Jason Soroko: Well, it's a good message, and I'm glad those things were said on CA Day. It has to be said. And, thinking about the past, thinking about the way things have been, I think we are turning a page, Tim, and I think that the automation era for certificates will force people to think differently, and hopefully everything you said starts to be understood. There's a low bar that we should never, ever go back to.

Tim Callan: I mean, the reason all those rules exist is we're forcing CAs to get better with rules and punishment. But why is this a story of crime and punishment? Why isn't this a story of being our better selves and striving for greatness and recognizing how blessed we are and investing just a little bit in doing the right thing?

Jason Soroko: I think that is part of the future, Tim, and I think that the adoption and the promotion and the encouragement of automation and a lot of these themes into the future will allow us to survive the gigantic headwinds that will come from post-quantum cryptography and other things that will truly challenge us.

If we go into the post-quantum era as a complete mess as an industry, the way that things were in, say, 2012 and before, we're finished. Like, trust on the internet won't exist.

Tim Callan: Yes, I think so. One of the things that we say in this podcast, in various episodes, a little bit here and a little bit there, is, I'll mention that the CA/Browser Forum is tightening up. And it really is. Just in the past five years since I've been with Sectigo, I've watched it tighten up a lot in terms of just clarity and rigor and specificity of what is expected of a CA. There's more removal of kind of individual judgment and individual interpretation, to be replaced by specific, codified, prescriptive rules and procedures, and this is because of what you said. We need it. And it's not just PQC. It's all kinds of things. It's PQC. It's new, evolving attacks. It's the fact that the digital footprint we have is bigger than ever. It's the incredible interconnectedness of everything. I remember a couple months ago, when you and I had Bruno on most recently, and he talked about how in the days of Y2K, if computers had broken, we all would have gotten by. But nowadays, if computers break, we can't. Nobody can do things with pen and paper anymore. All those procedures have been ripped out. We literally don't own fax machines. And so, those are the forces that are causing CA/Browser Forum to tighten up, and I think it's going to keep tightening up, because I think that's an existential requirement, but I also think that our own members are creating a headwind by failing to live up to their potential, and by failing to just, again, do a little bit of introspection and say, I have been granted the privilege to be this incredible thing. I'm a public CA. I'm the only one in my country. I'm one of 50 on the whole globe. And instead of abusing it and taking it for granted and milking it for pocket change, what I'm going to do is do it right and earn that privilege. I think if we saw all of the CAs doing that, we would be better off.

Jason Soroko: Tim, I'm gonna refer to an old episode, Root Causes 128: What is Total Certificate Agility? I would forgive anybody who hadn't heard it when it came out. It was deep in the pandemic, November 2020. And I would say that that should be a North Star for not just practitioners, people who are actually consumers, subscribers of publicly trusted certificates, but also for what you want to be able to provide as a CA. A lot of these other things that you've talked about, they're worth putting on a slide and being exasperated about because that is the minimum bar, but I think where we really should be at as an industry is what we were talking about over four years ago. Because I'm going to draw this line, Tim. Here's the difference between where I think some people are right now and where I think it’s heading and then I'll let you finish off this podcast.

So like right now, I'm really glad that we're in a place where I don't hear any of the CAs yet, complaining about MPIC. As an example.

The industry kind of came together and said BGP attacks are real. None of us wants to be misissuing due to a BGP attack. We'll do what's necessary and put in the rules around MPIC, and then we're going to implement it. That's the process where all the CAs are in right now, and I think that that's a huge difference from where we were in the past, where I bet you, I'll bet at least $1 and a half, there would have been a CA or two who would have been all whining about that and not implemented. And it would have been bad.

Tim Callan: So I hate to do this, Jay, but there absolutely have been CAs who have been whining about it, who have said, this is a theoretical attack. We've never seen one of these attacks in the real world. This is something that only state actors can do. Why are we spending all of our resources on this? Why are these deadlines so soon? And that's kind of what I'm talking about, and it's because they don't want to make the investment. They find it inconvenient. They want to do other things on their roadmap, or their corporate owners, or their government owners don't deem it to be important, and so you see them prioritizing those other things rather than recognizing.
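Editor's note: MPIC, multi-perspective issuance corroboration, is the practice discussed above of repeating domain validation from several network vantage points so that a localized BGP hijack cannot fool the CA on its own. The sketch below shows only the quorum idea. The function name, the inputs and the default tolerance are assumptions for illustration and are not taken from the actual requirements text.

```python
def corroborated(primary_ok, remote_results, max_non_corroborating=1):
    """Decide whether a domain validation is corroborated.

    primary_ok: result of the CA's own validation attempt.
    remote_results: one boolean per remote network perspective.
    max_non_corroborating: how many remote disagreements are tolerated
        (a policy parameter, set here to an assumed value).
    """
    if not primary_ok:
        return False
    disagreements = sum(1 for ok in remote_results if not ok)
    return disagreements <= max_non_corroborating


if __name__ == "__main__":
    # Normal case: every vantage point sees the same challenge response.
    print(corroborated(True, [True, True, True, True]))
    # Hijack near the primary: it is fooled, distant perspectives are not.
    print(corroborated(True, [False, False, True, False]))
```

In the second call the primary validation succeeds, but most remote perspectives still reach the legitimate server and do not see the attacker's response, so the check fails and the CA would not issue.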

One of the analogies that I give a lot, and I may have done this on the podcast already, but I think it's appropriate, is if you think of the WebPKI as a wall around a fortress, every CA is a gate in that wall. It's how things go in and out. Every CA is a gate. And to prevent invaders from coming through your wall, you have to prevent invaders from coming through your gate. Now, all of the gates are equally effective for an invader, so you don't have to hit the big, popular gate that everybody has heard of, like Sectigo or, Let's Encrypt, you can go hit the tiny little CA located somewhere in Eastern Europe or Asia that is weak and sloppy and has poor security and still get through the wall. And so that's one of the problems. You've got folks who aren't signed up to be pursuing rigor and security and exactitude at the level that is required to be one of the gates through the wall that is the WebPKI, and yet they still are one and that's where we have our trouble.

Jason Soroko: Tim, I'm thinking back to a blog post and a presentation, I think it was by Moxie Marlinspike, back in 2011. One of the ideas within that blog was about trust agility. In other words, just like in the analogy you gave, you didn't have to go after one of the big CAs. You just had to go after one of the CAs that was in a major browser trust store. Just any one of them.

And that's the castle you're talking about. And so what Moxie was saying way back in 2011 was, maybe we should make it easier for users to switch in and out who they trust. And I think that with enough time having passed, we should be pleased that we're far from where we were in the 2011 time frame in terms of just how CAs have evolved, how the CA/Browser Forum has evolved. Things have gotten so much better. So we're not in that dire circumstance that the world was in back then.

But on the other hand, I think that that idea of trust agility never went anywhere for the same reasons that the green bar has gone out of the browsers. It's just any of these things that you have to have a propeller on your head to understand what it is you're actually doing doesn't really add security. And so I guess what I'm really trying to say, Tim, is the onus really is on the CAs to do the right thing.

Tim Callan: For sure. Like when this whole model was created back in the 1990s, there was kind of this built-in assumption of, well, look, if you're banking online, you have a certain amount of computer acumen, or you wouldn't be doing this, right? And that was probably true in 1997. It's absolutely not true today. And so, in 2011 even, honestly, that feels like a little bit of an outdated viewpoint. And I'm not criticizing Moxie Marlinspike. He is very smart. He's done a lot of great stuff. But even in 2011 I would probably have pushed back on that and said, I don't know. But certainly by 2024 we can. Like no. The people who create and run these systems, the organizations who create and run these systems, are the organizations that have to get it right, and that's primarily browsers and public CAs.

Jason Soroko: That is the point, Tim. The onus is on the CAs, following procedure and helping to craft that procedure, which is exactly what you're doing right now as Vice Chair within CA/Browser Forum. Like the participation of the CAs is as important as anybody else's, and that's a big change from the past. I'm interested to see, though, how the system will police itself in the near future, where we still have some CAs who, I would say, are below the line of where they need to be, just in terms of making choices that are between right and wrong.

Tim Callan: And I do think there's an aspect of that. I mean, I understand that, at the end of the day, business decision making is driven by business pragmatics and I understand that companies only have the resources they have and all that stuff, but I still think there needs to be an attitude shift in a whole lot of public CAs where we say, the only reason we're here, the only reason I am fortunate enough to have this job is because I get over a certain minimum level, and if my CA can't get over that minimum level, then maybe I'll go the way of Entrust or e-commerce monitoring, and maybe I won't be a CA next year or two years from now. And just honestly, be a little more high minded. Try to do it right.
