Podcast
Root Causes 52: New TLS Certificate Incident Research


Hosted by
Tim Callan
Chief Compliance Officer
Jason Soroko
Fellow
Original broadcast date
November 22, 2019
New research out of Indiana University Bloomington reviews nearly 400 "incidents" with public SSL certificates over the course of more than a decade. Join us as we go through the main findings from this piece of original research, including methodology, incident types and causes, and rogue certificates.
Podcast Transcript
Lightly edited for flow and brevity.
So, let's break down what they are real quick. I mentioned the first one, the most popular: fields within the certificate are not compliant with the Baseline Requirements, at 38.5%. Next down, at 10.3%, is a non-BR-compliant or problematic OCSP responder or CRL. So, OCSP and CRL problems constitute 10.3% of the reported incidents, right? That's high, but maybe not; again, it's not a misissued cert, right?
Repeated or insufficient-entropy serial numbers, 5.8%. So, we talked about a major entropy problem that occurred earlier this year with serial numbers, where a lot of CAs had 63-bit rather than 64-bit serial numbers. We did a whole episode on that, and I wonder if that boosted the number on this one.
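For readers of the transcript who want to poke at this themselves, here is a minimal sketch, not part of the research, using Python's cryptography library and a hypothetical certificate file name, that reports how many bits a certificate's serial number actually spans; the 63-bit vs. 64-bit issue shows up when a CA's serials never reach the full expected length.

```python
# Minimal sketch (illustrative, not from the paper): inspect a certificate's
# serial number bit length. "example_cert.pem" is a hypothetical file name.
from cryptography import x509
from cryptography.hazmat.backends import default_backend

with open("example_cert.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read(), default_backend())

serial = cert.serial_number
print(f"Serial number: {serial:x}")
print(f"Bit length:    {serial.bit_length()}")
# One certificate can't prove an entropy problem; the 63-bit issue discussed
# above only shows up statistically, across a CA's whole population of serials.
```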
Undisclosed sub-CA, 5%. Surprisingly high to me. I didn't perceive undisclosed sub-CAs to be a significant problem, but they find a total of 19 incidents, which is more than I would have thought.
512- or 1024-bit keys, 4.75%. So, people using keys that are not strong enough. It's the kind of thing you'd expect to see.
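Here is another small sketch of ours, under the same assumptions as above, that flags an RSA key below the 2048-bit minimum the Baseline Requirements currently set.

```python
# Minimal sketch (illustrative): flag RSA keys below the current BR minimum.
from cryptography import x509
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.asymmetric import rsa

MIN_RSA_BITS = 2048  # current Baseline Requirements minimum for RSA

with open("example_cert.pem", "rb") as f:  # hypothetical file name
    cert = x509.load_pem_x509_certificate(f.read(), default_backend())

key = cert.public_key()
if isinstance(key, rsa.RSAPublicKey):
    status = "too weak" if key.key_size < MIN_RSA_BITS else "ok"
    print(f"RSA key: {key.key_size} bits ({status})")
else:
    print(f"Non-RSA key type: {type(key).__name__}")
```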
Possible issuance of rogue certificates, 4.75%, as opposed to rogue certificates, which you referenced earlier, Jay, at 3.17%. So, they broke out known rogue certificates and possible rogue certificates as different kinds of incidents.
Then we've got use of the SHA-1 or MD5 hashing algorithm at almost 4%, 3.96%. And presumably, I would bet if we were looking at the details, those would be older incidents, probably from shortly after the deprecation of those algorithms. I bet nobody's doing that today.
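And one last sketch along the same lines, again ours rather than the researchers': reading off the hash algorithm used in a certificate's signature, so a SHA-1 or MD5 signature would stand out.

```python
# Minimal sketch (illustrative): report a certificate signed with a
# deprecated hash algorithm such as SHA-1 or MD5.
from cryptography import x509
from cryptography.hazmat.backends import default_backend

DEPRECATED_HASHES = {"md5", "sha1"}

with open("example_cert.pem", "rb") as f:  # hypothetical file name
    cert = x509.load_pem_x509_certificate(f.read(), default_backend())

hash_alg = cert.signature_hash_algorithm  # may be None for some schemes
name = hash_alg.name if hash_alg else "none"
if name in DEPRECATED_HASHES:
    print(f"Deprecated signature hash: {name}")
else:
    print(f"Signature hash: {name}")
```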
I think there's one other thing I'd love to hit before we leave, which is the causes. So, again, they have a chart. This is their Table 11, and again, I'm just going to run down the chart. It's a little shorter than the other one. And we'll talk about how they categorize the causes.
So, the first cause is software bugs, and the percentage of incidents they attribute to software bugs is 24%. That makes sense to me. These are automated systems running a lot of software, and if there's a software error where it's inputting the wrong value, that could be perpetuated across certificates before it gets discovered and fixed. Software has bugs; it happens.
The second one is interesting. Believed to be compliant/misinterpretation/unaware. They have that at 18.2%. So, this is just CAs not interpreting requirements correctly, doing something they believe is compliant while other people disagree, and presumably, if it's on this list, the other opinion won out in the long run.
The third one, business model, is interesting. Business model/CA decision/testing, 13.7%, and they go into details on this. They talk about how a CA's business model could be in opposition to the overall public trust, right? If you're in the business of selling certificates to skeevy people, then skeeviness is rewarded. And so these researchers feel that nearly 14%, nearly one in seven, of the incidents belongs to that reason. So, wow, that's kind of high. That's higher than I wanted it to be and higher than I imagined it would have been. So that was a takeaway for me.
Human error, 9.8%. And of course, especially if you're going back to 2008, that kind of timeframe, these were very human-intensive processes. I think over the years they've become much more automated, and I would hope we'd see the human error number go down just because we can solve those problems with computers.
Operational error, 7.6%. I'm not sure exactly what the definition of that is.
Non-optimal request check, 6.3%. So, I think that means the actual authentication, the process of authenticating the identity, maybe wasn't performed as well as it could have been. Someone makes a mistake.
Improper security controls, 4%. So again, it would be interesting to dig into that and see what it is.
Change in Baseline Requirements, 1.85%. So, the BRs get changed, and CAs don't get the memo, or don't change successfully, or don't change quickly enough, and that winds up accounting for nearly 2% of the incidents on the list. Again, it's important to stay compliant and current, but that's probably more forgivable than a business model decision.
Infrastructure problem, 1.6%.
Organizational constraints, 1.6%. What's that? Not enough resourcing? Not the right language skills? Not the right cultural knowledge? Something like that.
Other, 2.1%, and no data, 9.2%. So, for 9.2% of them, they didn't feel they could answer why the incident happened. But that's the breakdown. So again, software bugs is the biggest, differing interpretations of the BRs is the second biggest, and between those two you account for more than 40% of the incidents.

