Podcast
Root Causes 16: PKI for DevOps Environments


Hosted by
Tim Callan
Chief Compliance Officer
Jason Soroko
Fellow
Original broadcast date
May 14, 2019
DevOps as a software development and deployment methodology has radically transformed enterprise computing. This approach brings with it new architectures and tools such as containerization, Kubernetes, and multi-cloud. Learn how PKI plays a critical role in DevOps environments and how enterprises can best use certificates to keep their platforms safe.
Podcast Transcript
Lightly edited for flow and brevity.
Probably one of the main goals with DevOps is to create a fast and stable workflow between the two groups, because right now there's a lot of siloing between them. The way things had traditionally been done, with waterfall methods of software development and other old ways of working, maybe worked fine in the old monolithic coding days. But today, with burgeoning technologies such as cloud, how do you get your applications up to the cloud quicker, with fewer bugs? How do you get a faster rate of going from the point of conceiving something, getting it out there and live, and then being able to rapidly iterate and change it? The people who run operations and the people who run development are going to need to hold hands a lot tighter.
There are maybe four principles behind this idea. One of the main ones is everything as code. Probably one of the biggest obstacles to collaboration between developers and operations has been the sheer amount of manual work that used to have to happen.
In other words, ask yourself this simple question. If I gave you an operations person who is monitoring a cloud service you just built, and a developer who is building patches for it, what are the chances on any given Monday morning that those two people will have exactly the same system if they both worked manually to build the Linux distribution and all the dependencies and all the things that make up a computing system and make it go?
The chances are pretty slim, right?
So there are lighter-weight forms of that, and we'll get into that technology in a moment. That's containerization, and we're going to talk about the security of containerization a little later in the podcast. But let's come back to this whole idea of DevOps and the four ideas behind it.
Everything as code: a really important concept within DevOps. Remember the scenario I gave you, where you have two different people each trying to build their own infrastructure to match the other's. The chances that every single configuration within those systems is going to be identical are pretty slim. Therefore, the entire infrastructure should be codified in some sort of declarative specification.
This means that standing up infrastructure really shouldn't be done by hand. It should be codified for consistency into a kind of template that can be repeated often, and there are lots and lots of tools for that now. You've probably heard of Chef and Puppet; they all have their strengths and their reasons why you'd use one or the other. The point is codifying how things stand up, especially because you're constantly going to be bringing cloud infrastructure up and constantly bringing it down, and you want those things to happen very, very consistently and have the results be the same every time.
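As a rough sketch of what "everything as code" can look like in practice, here is a small, hypothetical Go example. The ServerSpec type and apply function are illustrative only, not any particular tool's API; the point is that the desired state lives in a declarative, version-controlled structure and an idempotent step converges every environment to it.

```go
package main

import "fmt"

// ServerSpec is a hypothetical declarative description of a host:
// the same spec, applied anywhere, should yield the same machine.
type ServerSpec struct {
	Distro   string   // e.g. "ubuntu-22.04"
	Packages []string // everything the application depends on
	Services []string // what should be running
}

// apply is intentionally idempotent: running it twice against the
// same host should produce no additional changes.
func apply(spec ServerSpec) {
	fmt.Printf("ensuring base image %s\n", spec.Distro)
	for _, p := range spec.Packages {
		fmt.Printf("ensuring package %s is installed\n", p)
	}
	for _, s := range spec.Services {
		fmt.Printf("ensuring service %s is running\n", s)
	}
}

func main() {
	// Dev, QA, and ops all apply the identical spec from version control,
	// so their environments cannot drift apart through hand edits.
	apply(ServerSpec{
		Distro:   "ubuntu-22.04",
		Packages: []string{"nginx", "openssl"},
		Services: []string{"nginx"},
	})
}
```

Real tools like Chef and Puppet do this at much greater depth, but the principle is the same: the spec is the source of truth, not the machine.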
I think with the cloud, though, you now have all kinds of different Linux distributions, and by definition every Linux distribution is bundled with a different set of code, a different set of dependencies, a different stack of this and that, all the way from the GUI down to the nuts and bolts.
So in terms of the "everything as code" idea, keep in mind that not only do you want this declarative specification so you can very easily have a consistent platform all the time, but you also want, as always, your source code under control in something like Git. And for the same reasons as before the DevOps days, you also want your code tested in some sort of quality assurance program, some sort of pipeline process, to make sure it passes muster.
But there's another, probably newer idea that we'll call immutability. Tim, remember when I said that in the past you might have set up a Windows server for your infrastructure, and that thing probably stayed up for a very long time, and you just made changes to it by hand. Development might have passed patches over to the operations team, and operations might have applied those patches as time went on, but the server itself never really changed. There are probably several ways to define immutability, but I really like the idea that infrastructure should be considered disposable.
What you gain from that is avoiding one piece of infrastructure being patched to a level that's inconsistent with another. In other words, your QA systems, your developer systems, and the systems that might be used as some sort of test server by operations should all be pretty much identical, and all of them should be considered disposable. You should never have one server that lives forever and becomes the de facto gold standard that everybody else needs to figure out how to match. Everything is immutable.
Let’s talk about one of the underlying technologies that’s really helping out and it’s having a real renaissance right now because of how important it is.
We talked about virtualization a little earlier, and obviously that was a way of taking monolithic pieces of software and running them in isolated virtual machines. And that's been fantastic. It still works to this day, and people still use it for all kinds of things. But what happens once you no longer have any problem standing up lots of small servers in your cloud environment, for example, or even in your own private rack space?
One problem you still have, though, is that there are all these different distributions of Linux out there, and you don't want your code to be distributed in such a way that the operations people have to worry about the dependencies. Software obviously has its dependencies, and the servers have all kinds of different starting points. You want to be able to get up and running from just about any starting point and have that discrete bit of logic just do its thing.
This is where the concept of containerization comes in. Now, most people think that containerization is a form of virtualization, and I think that's where a lot of people get into trouble, because a container is not a VM. A container is really about bundling a discrete piece of logic, its code essentially, along with its dependencies. That's probably one of the most important concepts to grasp if you really want to understand containerization.
The question then becomes, "Well, why do I need it? We already have virtualization." Virtualization isolates an entire operating system, with each instance hosted on a hypervisor. Containers, by contrast, run within a container engine on a shared operating system and are coordinated by orchestration engines. You might have heard of Kubernetes, which I think is actually derived from the Greek word for helmsman, and orchestration is something we'll get into in a moment. But keep in mind that containers are much more lightweight than VMs and much less isolated from the underlying operating system.
If you want to understand containers at the highest level, they're really lightweight ways of bundling together code along with its dependencies, and they really should not be thought of like a VM, because of how much less isolation you have from the underlying operating system.
Obviously, there are problems. Check out any of your favorite hacker conferences and you'll find examples of people discovering holes in various hypervisors. But suffice it to say that jumping out of a hypervisor is not something the average script kiddie is going to do on a Saturday afternoon.
But the problem is, think about all these different discrete pieces of code, each with their own interconnects. They connect to a database. They have their own user connections and human-based authentication. One container might call another, not just within its own cloud but in another cloud; that's multi-cloud. Any time you're reaching out and touching anything, you're traversing network boundaries that are no longer as clear and secure as you might remember from the old monolithic days.
A single piece of software, if you want to call it that, a whole solution might be calling all kinds of containers, might be using other people’s containers. It just goes on and on. In fact, this is the worst kind of spaghetti logic potential that perhaps we’ve ever had.
Maybe other people could argue otherwise, Tim, but I think the reason we don't have to worry too much about it becoming spaghetti code is the amount of benefit we're getting from isolating discrete bits of code and hosting them in the cloud. There's a new way of thinking here, especially with the DevOps cultures we're now starting to see develop. It's all a very, very big net positive.
There is one big potential net negative. The orchestration engines that are helping to curtail that spaghetti potential, and that do things really well like handle networking definitions and everything else necessary to make lots and lots of containers work together, are also, among other things, running Certificate Authorities. That's because there is such a big need for TLS certificates, for things such as mutual TLS authenticated sessions to other APIs or discrete pieces of logic. And if your application happens to be a web application, your SSL certificate might even be provisioned within that logic every time this immutable, disposable infrastructure is brought up and torn down. That's a lot of certs all of a sudden.
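To make that concrete, here is a minimal Go sketch of the client side of a mutual TLS call between two containers. The file paths and hostname are placeholders, not anything from the episode; the point is that every workload needs its own certificate and key, plus the private root it trusts, every time it is stood up.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"log"
	"net/http"
	"os"
)

func main() {
	// Client certificate and key issued by the internal CA (placeholder paths).
	clientCert, err := tls.LoadX509KeyPair("client.crt", "client.key")
	if err != nil {
		log.Fatal(err)
	}

	// Trust only the private root that the internal CA chains to.
	caPEM, err := os.ReadFile("internal-root.pem")
	if err != nil {
		log.Fatal(err)
	}
	roots := x509.NewCertPool()
	roots.AppendCertsFromPEM(caPEM)

	client := &http.Client{
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{
				Certificates: []tls.Certificate{clientCert}, // presented to the server
				RootCAs:      roots,                         // used to verify the server
			},
		},
	}

	// A container-to-container API call over mutually authenticated TLS.
	resp, err := client.Get("https://payments.internal.example/api/charge")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	log.Println("mTLS call succeeded:", resp.Status)
}
```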
Think back to the old days of, "Geez, I just need one SSL cert and it's going to sit there for a year, two years. This particular application goes off and makes an API call to something. I've got a TLS certificate that I provisioned a long time ago. I've got it written down in a spreadsheet, so when that thing expires, I'll just go handle it." Multiply that by, I don't know, pick a large number in your mind. All of a sudden it's unmanageable. And of the Certificate Authorities we're talking about here, I think the majority are self-signed CAs that are just sitting there on not terribly well protected premises, if you want to call it that.
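Even the simplest piece of that bookkeeping, knowing when each certificate expires, has to be automated once the number of endpoints grows. A rough Go sketch, with placeholder hostnames standing in for whatever that large number of endpoints turns out to be:

```go
package main

import (
	"crypto/tls"
	"fmt"
	"time"
)

func main() {
	// Placeholder endpoints; in a containerized deployment this list is
	// large and changes every time infrastructure is torn down and rebuilt.
	hosts := []string{"api.example.com:443", "payments.internal.example:8443"}

	for _, h := range hosts {
		conn, err := tls.Dial("tcp", h, &tls.Config{})
		if err != nil {
			fmt.Printf("%s: %v\n", h, err)
			continue
		}
		// The leaf certificate presented by the server tells us its expiry.
		leaf := conn.ConnectionState().PeerCertificates[0]
		fmt.Printf("%s expires in %v\n", h, time.Until(leaf.NotAfter).Round(time.Hour))
		conn.Close()
	}
}
```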
As a PKI guy, I just shake my head. I love this technology. It is the future. But when it comes down purely to the TLS certificate management part of it, I don't think this has been fully thought out yet.
And one question we haven't even asked much yet, because we think the answer is perhaps too obvious: what happens if one of those CAs gets compromised? The answer to all of these questions is very, very bad things.
Well, that's great. It's still a self-signed CA. It still has its issues, but at least you have some kind of management system helping you out, one that perhaps is part of the infrastructure you happen to be using with that one cloud. But what happens if your CIO says tomorrow, "AWS is too expensive. I want to rip it out and put it onto some other cloud tomorrow, and then next week I want to bring it in-house, and oh, by the way, the week after that…"
You might want to think about setting up a CA with people who actually understand how to run a proper CA and know how to protect it and know how to make it reliable and all those things that you don’t have if you just do it by yourself.
The ability to have a single root for all your applications, the ability for the root and subordinate private keys to be protected in an HSM, the ability to have multiple Kubernetes clusters rooted in a single place, or whatever other trust model you happen to need: those are the kinds of things for which you need to go to a trusted third-party CA.
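For contrast, here is roughly what the do-it-yourself version looks like: a self-signed root generated in software with Go's standard library, with the private key sitting in ordinary process memory rather than protected by an HSM. This is a sketch of the pattern being described, not a recommendation.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"log"
	"math/big"
	"os"
	"time"
)

func main() {
	// Root key generated in software; an HSM-backed CA would never expose this.
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		log.Fatal(err)
	}

	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "Example Self-Signed Root"},
		NotBefore:             time.Now(),
		NotAfter:              time.Now().AddDate(10, 0, 0),
		IsCA:                  true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageCRLSign,
		BasicConstraintsValid: true,
	}

	// Self-signed: the template and the parent are the same certificate.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		log.Fatal(err)
	}

	pem.Encode(os.Stdout, &pem.Block{Type: "CERTIFICATE", Bytes: der})
}
```

Everything issued under this root depends on that one software key staying secret, which is exactly the "not terribly well-protected premises" problem described above.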
You're not going to be able to do that yourself, and typically the people writing tools, like the vault tools you've all read about, aren't experts in doing that either. What most vault tools are really trying to solve is things like, "You've got static credentials for your MySQL database or your Mongo database, whatever it happens to be, and I need to automatically log into those systems from a headless, discrete piece of logic. I'm not going to log in myself, so I need to pull the credential from somewhere, and I need to pull it securely." Those vault systems do that very, very well.
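Here is a hedged sketch of that pattern in Go: a headless workload pulling a short-lived database credential from a secrets service over HTTPS. The endpoint, header, and response shape are hypothetical, not any specific vault product's API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"os"
)

func main() {
	// Hypothetical secrets service; the URL and response fields are illustrative.
	req, err := http.NewRequest("GET", "https://secrets.internal.example/v1/db/mysql", nil)
	if err != nil {
		log.Fatal(err)
	}
	// Short-lived service token injected into the container at startup.
	req.Header.Set("Authorization", "Bearer "+os.Getenv("SERVICE_TOKEN"))

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var creds struct {
		Username string `json:"username"`
		Password string `json:"password"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&creds); err != nil {
		log.Fatal(err)
	}
	fmt.Println("fetched short-lived database credential for", creds.Username)
}
```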
But as soon as you get into the world of PKI, you face the complexity of the trust model, the complexity of rotating those certificates, and, here's another concept, your OCSP responders. Once you have large, complex, enterprise-level applications running within those containerized systems, are you going to do revocation checks on the certificates you've actually issued? A modern trusted third-party CA will also be able to set up OCSP responders for you and give you that kind of capability. Those are powerful PKI tools you're just not going to have if you're setting up your own self-signed Certificate Authority.
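For the revocation piece specifically, the check itself is straightforward once a responder exists to answer it. A minimal Go sketch using the golang.org/x/crypto/ocsp package, assuming the leaf and issuer certificates are already parsed and the leaf carries an OCSP responder URL:

```go
// Package revocation shows a bare-bones OCSP status check.
package revocation

import (
	"bytes"
	"crypto/x509"
	"io"
	"log"
	"net/http"

	"golang.org/x/crypto/ocsp"
)

// checkRevocation asks the certificate's OCSP responder whether the
// certificate has been revoked. cert and issuer are assumed parsed.
func checkRevocation(cert, issuer *x509.Certificate) error {
	reqDER, err := ocsp.CreateRequest(cert, issuer, nil)
	if err != nil {
		return err
	}

	// The responder URL is embedded in the certificate itself.
	resp, err := http.Post(cert.OCSPServer[0], "application/ocsp-request",
		bytes.NewReader(reqDER))
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	ocspResp, err := ocsp.ParseResponseForCert(body, cert, issuer)
	if err != nil {
		return err
	}
	if ocspResp.Status == ocsp.Revoked {
		log.Printf("certificate %v has been revoked", cert.SerialNumber)
	}
	return nil
}
```

None of this works without someone actually operating the responder, keeping it available, and signing its responses, which is exactly the kind of service a commercial CA provides and a self-signed in-cluster CA typically does not.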

