Podcast
Root Causes 71: Short Lived DevOps Certificates


Hosted by
Tim Callan
Chief Compliance Officer
Jason Soroko
Fellow
Original broadcast date
March 6, 2020
Repeat guest and DevOps expert David Colon joins us again to discuss identity for microservices, including the use of very short-lived TLS certificates. David and our hosts explore the unique properties of PKI in these environments and describe how to find the optimal term for a container certificate.
Podcast Transcript
Lightly edited for flow and brevity.
And then the next common thing that people have done is, oh, just generate an SSH key and give us your public key and we'll put it in the server’s authorized_keys file. And that's great, too. But when you start working at a bigger company, depending on the tools you have, if configuration management wasn't a tool there, you probably have your own Bash or Perl script that hands out SSH keys, and it doesn't happen instantaneously. So, you can have a developer or a sysadmin start day one, and they don't have access to production systems until, you know, their 90th day in. So that's a slow process. And you know, you don't really get the most bang for your buck from your onboarding process for that employee, and that also introduces challenges with configuration drift, something that we've talked about before. Another thing that kind of sucks with that approach, and it's similar to the username and password, which I've alluded to before, is the SSH host key fingerprint. When you're using username and password authentication, or if you're using SSH keys as your authentication, as a client, when you SSH as root to your server, it will say, hey, this is the host key’s fingerprint. Do you trust it? And I brought this up before. I've never met anyone that said, hey, can someone verify this is correct? And even when it is correct, in the concept of immutable infrastructure, that IP or hostname may be the same, but the server actually changed. So, your SSH client might complain, saying, hey, there might be a possible man-in-the-middle attack. And that kind of sucks, because it's trying to be helpful, but now you've trained a bunch of sysadmins to basically ignore that. So, you never truly know if there's a man-in-the-middle attack happening.
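For readers who want to see what verifying that host key prompt could look like in practice, here is a minimal sketch in Python. It is not something discussed on the show. It assumes the server's public host key line has been obtained through a trusted channel, and the function names `ssh_fingerprint` and `host_key_matches` are hypothetical. It computes the SHA256 fingerprint in the form OpenSSH displays, which is the unpadded base64 of the SHA-256 digest of the decoded key blob.

```python
import base64
import hashlib


def ssh_fingerprint(public_key_line: str) -> str:
    """Return the SHA256 fingerprint of an OpenSSH public key line.

    The line is expected to look like "<type> <base64 blob> [comment]",
    as found in a host's ssh_host_*_key.pub file.
    """
    parts = public_key_line.split()
    if len(parts) < 2:
        raise ValueError("expected '<type> <base64 blob> [comment]'")
    blob = base64.b64decode(parts[1])
    digest = hashlib.sha256(blob).digest()
    # OpenSSH shows the digest as base64 without '=' padding, prefixed "SHA256:".
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def host_key_matches(public_key_line: str, fingerprint_from_prompt: str) -> bool:
    """Compare a trusted copy of the host key against what the client showed."""
    return ssh_fingerprint(public_key_line) == fingerprint_from_prompt.strip()
```

A sysadmin could call `host_key_matches` with the key published by whoever provisioned the server and the fingerprint shown in the SSH prompt, rather than answering "yes" by reflex.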
Now, a question that might come up would be, what is an appropriate time for expiration? Is this something like IoT? Do you want it to be 10 years, right? No one can predict how long an employee may stay at the company. So, do you rotate every 90 days? What's the friction there? And that's where the next evolutionary thought comes into play, which is this concept of short-lived client certificates. With short-lived client certificates, basically it promotes the idea that has always existed, which is that no one should log into a production server, because there is inherent danger, right? A sysadmin can log into a production server as root and may do something as simple as update the binaries or the packages on that system, but because they upgraded MySQL from, let's say, 4 to 5, it broke the application, and now the business is having to deal with an outage. By using short-lived access tokens, you can create these audit policies to see who has access, you can, you know, fine-tune it to say what time people should have access, or maybe mandate that there should be an acknowledged alert before someone asks for that access token. Another good thing - -
And for people that aren't aware, for those of you who, you know, have only lived in this newer world of SSH to get into your servers, in the Microsoft stack of technologies, we're still dealing with probably about a 20-year-old problem, the pass-the-hash problem, which is where, when you log into a system, even remotely, your login hash is actually stored on that remote computer. It's on yours. It's on the remote computer. If somebody bad comes in and is able to read a specific, protected region of memory on that computer system, they're then able to take that hash and impersonate you across the network, across anything that Active Directory is able to access. Being a domain controller administrator is kind of the holy grail for the bad guys, because once they get a hold of that hash, it's almost unlimited what they can do with your entire Active Directory tree. So, it's interesting how old ideas come into play again, and I really like the fact that it took people in the Microsoft stack of technologies world, you know, a decade or two in order to think of these things. I'm really glad now, Dave, that, you know, in this new world where people are working in cloud systems, in DevOps, working with SSH authentication, these are really strong forms of authentication, and I'm really, really glad to hear that the principle of least privilege is now part of the conversation much earlier on. So, congratulations to you and your generation for coming up with this faster.
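To make the short-lived access idea from the discussion above a bit more concrete, here is a rough sketch in Python of the kind of policy David describes: a credential that expires quickly, is only handed out during an allowed time window, can require an acknowledged alert first, and leaves an audit record either way. Everything in it is an assumption made for illustration (`AccessPolicy`, `AccessGrant`, `request_access`, `is_valid`, the 15-minute lifetime in the usage note), not a real product's API, and a real system would issue a signed certificate rather than a plain record.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta, timezone
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class AccessPolicy:
    max_lifetime: timedelta           # how long an issued grant stays valid
    window_start: time                # earliest time of day access may be granted
    window_end: time                  # latest time of day access may be granted
    require_acknowledged_alert: bool  # must an alert be acknowledged first?


@dataclass(frozen=True)
class AccessGrant:
    user: str
    issued_at: datetime
    expires_at: datetime


def request_access(
    user: str,
    now: datetime,
    policy: AccessPolicy,
    acknowledged_alert_id: Optional[str],
    audit_log: List[Tuple[str, str, str]],
) -> Optional[AccessGrant]:
    """Grant short-lived access if the policy allows it; always write an audit entry."""
    in_window = policy.window_start <= now.time() <= policy.window_end
    alert_ok = (not policy.require_acknowledged_alert) or acknowledged_alert_id is not None
    allowed = in_window and alert_ok
    audit_log.append((now.isoformat(), user, "granted" if allowed else "denied"))
    if not allowed:
        return None
    return AccessGrant(user=user, issued_at=now, expires_at=now + policy.max_lifetime)


def is_valid(grant: AccessGrant, now: datetime) -> bool:
    """A grant is usable only between its issue time and its expiry."""
    return grant.issued_at <= now < grant.expires_at
```

With a policy such as `AccessPolicy(timedelta(minutes=15), time(9, 0), time(17, 0), True)`, a request with no acknowledged alert is denied and logged, while a request that cites one gets a grant that stops working 15 minutes later with no revocation step.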

