
The Access Control Podcast started in 2021, and this is the milestone 20th episode. Director of Developer Relations at Teleport, Ben Arent, launched the podcast as a chance to hear voices and viewpoints outside of the immediate sphere of Teleport; we’ve had a range of guests, including an FBI agent, a compliance manager, hackers, and teams building software to protect their infrastructure. So it was only fitting to celebrate the 20th episode with an interview with Teleport's co-founder and CEO, Ev Kontsevoy.

Teleport is the Open Infrastructure Access Platform, started in 2015, and Ev still holds the #2 contributor spot on the open-source GitHub project. Ev is a serial entrepreneur and was CEO and co-founder of Mailgun, which was successfully sold to Rackspace. Prior to Mailgun, Ev held a variety of engineering roles. Ev recently co-authored the O’Reilly book, Identity-Native Infrastructure Access Management: Preventing Breaches by Eliminating Secrets and Adopting Zero Trust. This episode covers many of these topics.

From Orange Book to Identity-Native

  • Access control consists of four technical components: Authentication, Connectivity, Authorization, and Audit.
  • Multics, an advanced operating system, serves as inspiration for Teleport's approach to scaling access control. Multics introduced the concept of a reference monitor as a central point for policy evaluation and enforcement.
  • The Trusted Computer System Evaluation Criteria (TCSEC), known as the Orange Book, set basic requirements for assessing the effectiveness of computer security controls.
  • The CIA triad (Confidentiality, Integrity, and Availability) is presented as the foundation of trustworthiness in computing systems.
  • Teleport provides identity-native infrastructure access to servers, cloud applications, and web applications. Teleport's implementation of zero trust involves technical aspects like reverse tunnels to establish connectivity behind firewalls.
  • The concept of true identity should be differentiated from the common practice of associating identity with electronic records or aliases.
  • The use of shared credentials or shared identities across various systems is a common anti-pattern.
  • The state of authorization in current systems is broken, and it's difficult to synchronize role-based access control (RBAC) rules across different layers of technology.
  • The discussion challenges the current emphasis on visibility and audit logs, suggesting that once authorization is properly solved, the importance of observability will decrease.
  • A collaborative and trust-building approach between security teams and engineers is critical. Security measures should not hinder productivity but should be designed to work seamlessly with the broader computing ecosystem.


Transcript

Ben: 00:00:23.254 Welcome to Access Control, a podcast providing practical security advice, advice from people who've been there. Each episode, we'll interview a leader in their field and learn best practices and practical tips for securing your org. This is a special episode recorded in person at Teleport Connect 2023. We'll be publishing a video of this conversation at Teleport Connect Virtual Replay on February 8th, 2024. Or if you're listening after the release, you can always catch a replay on Teleport's YouTube channel. For this episode, I'll be chatting with Ev Kontsevoy, CEO of Teleport. Teleport is an open infrastructure access platform started in 2015. And Ev also holds the No. 2 contributor spot on GitHub. Ev is a serial entrepreneur and was co-founder of Mailgun, which he successfully sold to Rackspace. Prior to Mailgun, Ev held a variety of engineering roles. Ev also co-authored the 2023 O'Reilly book, Identity-Native Infrastructure Access Management, and we'll be covering lots of those topics today. So, Ev, welcome. To kick things off, we're going to deep dive into your book. Can you tell me about the pillars of access?

Defining the pillars of access

Ev: 00:01:26.216 So at the end of the day, I will say that I just don't like the word security. I think it's a little bit kind of disconnected from the actual benefit we want to get from computers. So it all comes down to, are computers trustworthy or not. Because if they're not trustworthy, then they're useless. So I believe that we're in the business of actually delivering trust to computing. So you want your computing system to be trustworthy. So maybe I'm reformatting your question by saying: what are the kind of pillars of trustworthiness? And that's where that CIA triad comes from. So confidentiality, integrity, and availability. So if you have those three things and they're true, and they're not getting violated, so that means your system is trustworthy. Another way of looking at it, so when you say, "Okay, what do we need to do to build a trustworthy system?" So this is what I talked about earlier. So one of the components is that you do need to have an access control system. Because access control is basically about enforcing policy when subjects and objects interact with each other. So subjects, that's applications and people. So they all need to be treated equally. It's really, really important. And objects, that's your data. So every time people and applications interact with data, you need to enforce policy. So that's what access is. And now we could talk about what are the pillars of access, actually. So from a technical perspective, there are kind of four components that access consists of. That's authentication. People need to log in. That's the process of issuing an ID. So having identities is really, really important. Also, everything that I'm talking about equally applies to humans, to applications, and also to hardware, which means that applications need to log in, hardware needs to log in. When a server comes online, it actually needs to kind of log in and prove that I am actually a legit server. I'm not a honeypot. And the same thing for humans.

Ev: 00:03:25.578 So that's authentication. So once everyone is authenticated and so now everyone has an identity, now we could move to second component. So connectivity. Connectivity is essentially allowing objects and subjects to interact with each other. Back in the day, connectivity was almost invisible because everything happened inside of a single computer. But you're still establishing connection. Opening a file is a connectivity. And then you're using that file handle to read and write information into that file. So now instead of file handles, we're using socket handles. So everything happens over the network. So connectivity is now more important. This is where that kind of zero-trust component comes from. That zero trust essentially says that only subjects with identities are allowed to speak to other subjects or to access objects. That's the second component, connectivity. Again, authentication, that's the first one. Second is connectivity. So once everyone is connected and once everyone has an identity, and I will keep repeating when everyone and everything, so then it comes down to authorization. What are you allowed to do? So you do have access to this object. Can you delete it? Can you modify it? So what kind of access you actually have? So that's kind of authorization, the third component. And finally, the audit. What is going on? On your computer, you can actually request a list of all open file handles. And you will see like, "Oh, I have these applications that are accessing these particular files." Well, can you do it across your entire AWS account right now? Most people probably say no. That's actually what's broken, what we're trying to fix. That's like an audit. Audit and visibility, there are two sides to it, real-time, show me what's going on, just like I just explained. And also historical, what happened before?
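The single-computer case Ev mentions, asking for a list of open file handles, can be made concrete. This is a minimal sketch, assuming a Linux host that exposes `/proc/<pid>/fd`; the function name `open_file_handles` is hypothetical, and on platforms without `/proc` it simply returns an empty mapping.

```python
import os


def open_file_handles(pid: str = "self") -> dict[int, str]:
    """Map each open file descriptor of a process to the path it points at.

    Assumption: a Linux-style /proc filesystem. Returns {} where it is absent.
    """
    fd_dir = f"/proc/{pid}/fd"
    handles: dict[int, str] = {}
    if not os.path.isdir(fd_dir):
        return handles
    for name in os.listdir(fd_dir):
        try:
            # Each entry is a symlink named after the descriptor number.
            handles[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listing and reading; skip it.
            continue
    return handles


if __name__ == "__main__":
    for fd, target in sorted(open_file_handles().items()):
        print(fd, target)
```

The point of the passage is that nothing this simple exists for a whole cloud account: the same question has to be answered across many machines and many identities, not one process table.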

Ev: 00:05:13.075 So wouldn't it be nice if you, let's just say, you're operating a massive fleet of servers and you pick a random box and you pick a random file. And can you answer the question where it came from? Who created that file and why it was done? So if you had a Multics-like system controlling your entire cloud infrastructure, you would have been able to say that "Hey, Ev Kontsevoy was doing a deployment three months ago. And as part of the deployment, he ran that script, and that script pulled this file out of GitHub and put it here." So that kind of visibility today is actually missing at most organizations. So that needs to be built. Now I can go back just to give you a compressed answer to your question.

Ben: 00:05:48.149 Yeah.

Ev: 00:05:49.087 Connectivity, authentication, authorization, audit. And that's what the book is about. So it explores the different approaches to implement those four. And, obviously, it suggests what is kind of industry best practices for connectivity, authentication, authorization, and audit.
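The historical side of audit described above (pick a random file on a random box and ask who created it and why) amounts to a query over identity-tagged events. The following is a rough sketch under stated assumptions, not how Teleport stores audit data: the `AuditEvent` fields, the `provenance` helper, the host and session names, and the `github.com/example/app` source are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    timestamp: datetime
    identity: str     # who acted: a human, an application, or a machine
    session: str      # why: the activity the action belonged to
    action: str       # e.g. "create", "modify", "delete"
    host: str
    path: str
    source: str = ""  # where the content came from, if known


def provenance(events, host: str, path: str) -> Optional[AuditEvent]:
    """Return the earliest 'create' event for a file on a host, or None."""
    creations = [
        e for e in events
        if e.host == host and e.path == path and e.action == "create"
    ]
    return min(creations, key=lambda e: e.timestamp, default=None)


# Hypothetical sample log, invented for illustration.
LOG = [
    AuditEvent(datetime(2023, 8, 1, 10, 0), "ev", "deploy-42", "create",
               "web-17", "/opt/app/run.sh", "github.com/example/app"),
    AuditEvent(datetime(2023, 9, 3, 9, 30), "ci-bot", "deploy-57", "modify",
               "web-17", "/opt/app/run.sh"),
]

origin = provenance(LOG, "web-17", "/opt/app/run.sh")
if origin is not None:
    print(f"{origin.identity} created it during {origin.session} "
          f"from {origin.source}")
```

The query only works if every event carries an individual identity, which is why shared credentials undermine audit as well as authorization.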

What Multics is and the problems it aims to solve

Ben: 00:06:05.638 And I think you kind of touched on this briefly. But in the book, you mentioned the concept of systems that have been created in the past, which have kind of solved these problems before, one being Multics. For people who aren't familiar, can you describe what Multics is and what problems it went out to solve?

Ev: 00:06:19.940 First of all, at Teleport, we are great admirers of Multics. So the people who created the... Sasha, our CTO and co-founder, when he was originally... that's where his inspiration is coming from. And it's based on a really humble realization that most great ideas have been done before. The industry, we're all evolving on a spiral. So in the past, all of computing was happening within a single box, and we had the idea of an operating system that kind of managed everything. But then we moved forward in time into the present. And now we're operating with hundreds of thousands and millions of machines. So we lost a lot of luxuries that operating systems used to provide. Scheduling, for example, in the operating system: we just give it a process and the process will be executed with other processes. An operating system will manage resources. But eventually, we figured out how to take this capability into the cloud. So now we have systems like Kubernetes. And the access control subsystem of an OS, it's one of the things that we lost. And just like what Kubernetes is doing with scheduling, bringing the concept of scheduling from a single-box operating system to a data center scale, that's the vision of Teleport: to take the access control idea from an operating system and scale it up to a data center scale. Right now, we're just focusing on organizations. We want our customers, when they reason about their infrastructure, to actually now start to rely on the fact that you do have functioning access control. But wouldn't it be kind of cool to scale it up to internet scale, where nothing is ever anonymous? Because even if you solve all of these CIA triad problems for your infrastructure using Teleport, you don't want to live on an island. You will have a bunch of microservices that are calling APIs that you purchased from other companies, from companies like Mailgun, Twilio, or AWS. The list goes on.

Ev: 00:08:15.658 You now have the exact same access silo problem, but now it exists across organizations. So if I'm thinking about kind of the distant future for Teleport, I'm thinking how we can solve that. So if that becomes a reality, then you will end up in an access control system that covers the entire internet. But I think I'm digressing now. I can talk about Multics for hours. It was actually the most advanced. It's really old. But it's the most advanced operating system ever designed. It was so advanced that most people agreed it was too complex. So Linux borrowed some ideas from Multics. Windows borrowed some ideas from Multics, but all present-day operating systems are kind of less powerful. So for example, when the Department of Defense decided to look into using Linux in production, they realized like, "Oh, it doesn't actually have a robust access control system that Multics used to have." And that's where SELinux came from because we need to go and borrow more concepts from Multics simply because we do want a system to be trustworthy. That kind of reminds me of something. I saw the title of this podcast. The Orange Book was in there. So if we want our systems to be trustworthy, how do we know if they're trustworthy? So we came up with this idea of CIA triad. When I say “we”, I mean the computer community. It's not us. It came from way back. And everyone knows about the CIA triad, but it's not as simple to evaluate, to take a look at a computer and say, "Yeah, this computer has confidentiality. Yes, this one has integrity. This system is available." So the Department of Defense, they published this book called, I think, Trusted Computer System Evaluation Criteria (TCSEC), really, really long work. But it was famously known for being the Orange Book. So if you guys saw the movie, actually, a bunch of movies about hackers, the Orange Book is in them.

Ev: 00:10:08.849 It's basically kind of a manual on how to protect computing systems. And the hackers would read that book to figure out how to hack into them. And the exercise that I think would be kind of fun for most of you to do is to go find a copy of this book. It's pretty hard because I think the latest edition was published in like '76. But the book goes into great detail. It will force you to ask questions about your environments, computing environments, because your computer environments are computers. They are modern computers that we all use. And that book is also going to — it's reminding all of us why we even exist, cybersecurity professionals or security engineers, that we're in the business of making our computers trustworthy. And there is a manual that was written a long time ago by really, really smart people that basically tells you what you need to do. And in fact, you could even look at that book as a Teleport long-term roadmap.

The concept of a reference monitor

Ben: 00:11:04.056 And I think one thing in Multics that we've not touched on is the concept of the reference monitor.

Ev: 00:11:08.634 So in the access control theory, you have to have one central point where your policy is evaluated and enforced. In Multics, that component was called reference monitor, which means that every time you want to establish a connection — and again, connections exist even within a single computer. So every time there are two subjects they want to talk to each other or if one subject talks to another object, all of these connections get routed through a reference monitor. A reference monitor looks at everyone's identity. So nothing and nobody is allowed to be anonymous. And then it kind of checks the policy in a segment where the subject or where — I'm sorry — where the object lives, whether that interaction is allowed. And even though people look around now and say, "Hey, we have this new idea of zero trust. Let's go and become zero-trust organizations. Let's implement zero-trust architecture." But I like to point out, look, it's not a new idea. It was built in Multics many, many years ago. And now we just kind of collectively, as an industry, picking yet another good idea from the past, and we're trying to kind of scale it up and kind of move it into the future. So essentially, zero trust says that in your organization, you need to have a reference monitor. It's something. Now, you could call it a proxy if you want to get really technical, but there are probably other technical ways to implement it. But there is something that basically looks at every connection trying to be established, and it makes sure that, first of all, you're authenticated. Secondly, that's going to be an encrypted connection. And finally, you could even check things like what kind of encryption that is. And you could do things — that's, for example, a Teleport implementation of zero trust. Sometimes you want to establish connectivity into computing systems that are behind firewalls. They're not directly reachable. So you could do things like reverse tunnels. 
That's what Teleport allows you to do, which allows you to expand the reference monitor implementation outside of a single network. But I think we're getting too technical for the purposes of a general purpose —

Ben: 00:13:07.858 High level. Yeah.

Ev: 00:13:08.547 — security podcast.
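The reference monitor idea described above (every connection is routed through one point that checks identity, encryption, and policy) can be sketched in a few lines. This is an illustrative toy under stated assumptions, not Teleport's or Multics' implementation: the class names, the policy shape, and the `orders-db` example are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ConnectionRequest:
    subject: Optional[str]  # authenticated identity, or None if anonymous
    target: str             # the object (or subject) being reached
    action: str             # e.g. "read", "delete"
    encrypted: bool


@dataclass
class ReferenceMonitor:
    # policy[target] is the set of (subject, action) pairs allowed on it.
    policy: dict[str, set[tuple[str, str]]]
    audit_log: list = field(default_factory=list)

    def decide(self, req: ConnectionRequest) -> bool:
        """Evaluate one request; every decision is recorded for audit."""
        if req.subject is None:
            allowed, reason = False, "anonymous"        # authentication
        elif not req.encrypted:
            allowed, reason = False, "unencrypted"      # connectivity
        elif (req.subject, req.action) not in self.policy.get(req.target, set()):
            allowed, reason = False, "not authorized"   # authorization
        else:
            allowed, reason = True, "ok"
        self.audit_log.append((req, allowed, reason))   # audit
        return allowed


monitor = ReferenceMonitor(policy={"orders-db": {("ben", "read")}})
print(monitor.decide(ConnectionRequest("ben", "orders-db", "read", True)))
print(monitor.decide(ConnectionRequest("ben", "orders-db", "delete", True)))
print(monitor.decide(ConnectionRequest(None, "orders-db", "read", True)))
```

Keeping the decision in one place is the design choice that matters here: the policy lives with the object being accessed, and denied requests are logged as well as allowed ones.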

What identity means and how it applies to machines

Ben: 00:13:10.110 So can you really explain what identity means as myself? Ev, Ben, we have like my driving license. It's a form of identity. What are some of the forms of identity that's important for humans and machines?

Ev: 00:13:22.080 I was always unhappy with this term that people would say sometimes. My identity got stolen, or your identity can be stolen online. And technically, it's not true. Your identity cannot be stolen because your identity is stored in a physical world. To steal your identity, Ben, I technically need to steal you because if it's not you, that's not really your identity. It's something else. And the kind of early problem with identity that we had, and we still have, is that we apply the term identity to what essentially is an alias. Let's just say you sign up for a Gmail account and it asks you to pick a username and then a password, and we say the combination of those things are your identity now, well, that's not true. That's just an electronic record. That's just an alias that exists in one system. That indeed can be stolen. So if I kind of look over your shoulder and I see what kind of password you picked and I see your email and username, "Okay, I stole your identity." Even though technically, again, it's not true. So how do you implement true identity? And that's the term that we actually used in the book, just to kind of differentiate between traditional understanding of identity and the true identity. So we kept saying true identity over and over. So true identity is based on physical properties of something or someone that you try to identify. So in the case of you, it's your biometrics. It's your fingerprint. It's things you know. It's things you have. So all of that collectively represents Ben or Ev. And in terms of a machine, you also need kind of physical proof that that's a real identity. So as most of you know, servers, for example, they have hardware security modules, HSMs. They live on the motherboard. So that device is absolutely unique. There is no other physical object in the world that is similar. You cannot download it. You cannot upload HSM. It's glued or soldered to your motherboard. So that's the true identity of that machine.

Ev: 00:15:16.610 And that is the difference that the book introduces really, really early, where we talk about the importance of establishing these true identities. And establishing identities is a two-step process. Let's just use humans as an example. Let's just say that you come to Teleport. You want to get a job here. So you go through an interview. Then, okay, we give you a job offer. You show up for work for the first time. The first thing, this step is called identity proofing. So you actually need to show up in person. So you probably go to something like the HR department, and this is where your identity is validated. And this is where you could actually say that this is me. This is my fingerprint. We're going to give you a laptop, and we're going to register you in the system. So the important point here is that kind of verification, the initial identity proofing step. It happens in person. You cannot do it any other way. And then you need to find a way electronically in the future to reference the true identity of Ben or Ev, in this case. How do you do it with hardware, for example? Well, let's just say that for you, Ben, to start working at Teleport, we're going to give you a company-issued laptop made by Apple. So laptops, on their motherboards, they have a device called a TPM, a trusted platform module. You can think about it as a lightweight version of an HSM. There's actually a blog post we wrote about differences between HSM and TPM. You can go check it out. So when the laptop shows up, it also needs to enroll into the Teleport organization or your organization. And to do it, it needs to present, "I have my TPM." So in Teleport, and a product of Teleport, we do have this component called device ID where you need to enroll each laptop individually. And then what you do, you link Ben's biometrics, your fingerprint, in this case. Using Touch ID, you could pair it to a TPM on a laptop. And now collectively, that is the identity that we could use to do trusted computing at Teleport.

Ev: 00:17:15.640 And the difference between true identity, which is what I just described, and the kind of fake alias identity, is that true identity is not data. I want you to think about it. A true identity is not data. Because it's not data, you cannot steal it. You cannot download or upload a TPM. You cannot upload or download a fingerprint. So that information is stored in a physical world, and that is what it makes it secure. And that's why we call it a true identity. So for those of you who don't know how this kind of pairing of fingerprint to TPM works, because sometimes even engineers ask me this question, "Well, isn't this annoying?" Or "It violates my privacy," they would say, "when I need to contribute my fingerprint to a company." But that's not actually how it works. When you enroll your fingerprint when you're using Teleport, for example, your fingerprint doesn't leave your computer. It doesn't go on a wire into any kind of server. Instead, your fingerprint basically gets stored in a TPM. And it's the hardware itself that does all of this. It's unhackable, at least that's what hardware manufacturers are telling us. So which means your fingerprint, the signature of your fingers gets stored and signed by TPM itself. So that's really what creates this kind of physical, true real-world identity that is not exposed as data to any kind of software. So again, cannot be cloned. Cannot be downloaded. Cannot be uploaded. Cannot be sold on a darknet by a bad Apple employee. That's what true identity is. And it's kind of foundation. You start building the rest of your access control system once you have that foundation.

Anti-patterns observed around connectivity

Ben: 00:19:00.519 Cool. So going on from your recruiting example. So I've come in. I've got fingerprints. Got my laptop. All of this is all enrolled. The next stage is connecting to whatever I have. And I know you talked to lots of people. What are some anti-patterns you see around connectivity, things that people are doing wrong that they shouldn't be doing?

Ev: 00:19:16.778 So you're essentially asking me to over-generalize and then criticize a bunch of people I've never met. I've just acquired a bunch of enemies. So the one thing that I personally never quite subscribe to is what some people call defense in depth. Some people call it kind of security through obscurity. But trying to hide, I don't know, the SSH port that a particular computing system responds to and putting it on a less common port. When we do that, what do we actually say? It just means that we're not trusting our own access control system. So we don't even want people to connect to it. So that's one thing that I find a little bit silly. I personally run my own home, let's call it a small data center. I used to have a rack in a data center, but now I just have my basement. And all of this is kind of open to the world and on the port where it's supposed to respond. And it's just because I use a connectivity solution or a trusted monitor, I'm sorry, a reference monitor that is truly trusted. Another anti-pattern that is actually really, really common, unfortunately, is basically the use of shared credentials, or shared identities. So this comes from a very simple realization that, okay, my infrastructure is complicated. I have... and I, in this case, is a hypothetical cybersecurity engineer or DevOps engineer. So I have 100+ different technologies in my environment. Now, every single one of them, it could be Kubernetes, a database, some kind of CI/CD system, some kind of dashboard. There are lots and lots of DevOps tools and technology solutions that we all have to run in our clouds. And they all have a config file and documentation that says how to configure TLS for this thing. And there's like 100 plus of those. So now, my DevOps team needs to figure out how to configure authenticated connectivity into every little thing. And that is just time-consuming.

Ev: 00:21:15.257 And sometimes people, frankly, just don't even have the expertise. And because, look, we all know how horrible documentation could be. With some of these things, it's really challenging to figure this out. So then people just at some point are like, "You know what? I was just going to create, I don't know, a VPC, some kind of private network. And we're going to have this one credential or two or three credentials for kind of different roles. And instead of going through the pain and configuring every single tiny thing, we're just going to set up the perimeter, and we're just going to..." That's really where all the enforcement is going to happen. And obviously, that's anti-zero trust. Because in this case, if that particular thing is compromised, then someone kind of breaks through the perimeter, yeah, so the blast radius will be kind of equal to the size of the perimeter. And that's actually what you see in the movies too. In The Matrix or whatever, they always say, "Oh, hackers penetrated the network." So you hear that. That's an anti-pattern. In fact, hackers should be totally allowed in the network. The one system that I frequently use as a very common example of a true zero-trust implementation is your phone. Your phone is a computer. It's equivalent to a server like an EC2 instance in your AWS account. I would say it's more powerful than some EC2 instances. And it could have been the most powerful EC2 instance if it had landed 10, 15 years ago. And now think about what happens from Apple's perspective. So you have like a billion-plus iPhones in the world. That's a cloud environment. Don't you think? You have a bunch of servers somewhere out there and you're running an application called iOS on every single one. Then you have a new version of iOS and you need to push it to these phones. How is that different from a cloud deployment? We all do deployments like every day. And now notice the interesting thing. Apple runs this system without network protection. Every phone is on the public internet. There is no VPC. There is no firewall. There is no tracing or monitoring. Actually, most DevOps tools are completely absent. And it's actually incredibly resilient.

Ev: 00:23:14.543 And also, every server is managed by an extremely incompetent engineer. Let's put it this way. And it's fine. Has anyone ever heard about an example, someone hacking into an iPhone through an iOS update mechanism? This is zero trust. It's a zero trust, fully implemented. The way it works, your phone has a TPM. So it's not hackable. It only trusts Apple. So it's not zero trust. It's only Apple kind of trust. It establishes a reverse tunnel into Apple data centers, and then it gets updates through that tunnel. So even though you asked me about anti-patterns, I would rather say that everything that's not this is an anti-pattern. So that's how I think organizations need to think about designing their infrastructure. So then one day, they could just open up the VPC or firewall and just imagine that every device you have on your AWS account or in your data center has a public IP address with no firewall protection. And if you feel as competent and confident as Apple, that means you achieved a true zero trust.

What Teleport is and why it’s different

Ben: 00:24:23.487 Yeah, going on a bit of a segue. And this is a great introduction plug for Teleport. Now's your time to introduce it. So for people who are listening, who've never heard of Teleport, can you just give a quick introduction about why you started Teleport the company? What the Teleport product is?

Ev: 00:24:39.388 So first, let me introduce Teleport. So Teleport is an access control system, or what we now say instead is that we're an open access platform for your entire infrastructure. And the goal is to give you a practical way, and when I say give you, it actually applies to everybody because Teleport is open source, which is why we say we are an open-source platform, to give you a very quick, operational, tangible, very practical way to implement access. From a connectivity perspective, it's really similar to the cloud of iPhones example. But it also tries to deliver all of the best access control capabilities of Multics into the present in a way that is cloud native and compatible with the modern kind of DevOps way of doing things. So Teleport is access control for a cloud from kind of Multics' perspective. Now, about the history of the company, how it all started. It was somewhat similar to the history of Slack because Slack started as a gaming company, right? And Slack was an internal tool they built for chat. Similarly with Teleport: the legal name of the company is actually Gravitational, where we wanted to (maybe it would be kind of an ambitious thing to say) build a Multics-like system in the beginning. Something that allows you to build software, package it, and easily distribute it anywhere in the world, and it would run by itself, just like iPhone software, without DevOps teams, hopefully, for years. That system was called Gravity. And the company was named Gravitational after that. Gravity did have customers. The use cases were: how do we deploy applications? For example, think of a fast-food restaurant chain where you have 50,000 locations and every location has some servers in the back. So how do you build and deploy an application that works in a scenario like this? So you need something that allows you to have thousands of replicas of the exact same software running all over the world.

Ev: 00:26:38.115 This is, again, where analogies with iPhone come from. Or if you think about autonomous driving or autonomous flying platforms, some of those run Kubernetes on every single unit, believe it or not. So how do you build applications that you could deploy and run at this scale where you have, again, thousands and thousands of Kubernetes clusters and instances? And when we were building Gravity, problem number one (I'm a C programmer, so I should say problem number zero) was that we needed an access control system that would allow us to actually make this overall network trustworthy. So it all goes back into this trustworthiness. And Teleport was built as an access control subcomponent of Gravity. So if you think of Gravity as kind of a loose analogy to Multics, then Teleport is the reference monitor that was borrowed from Multics. Needless to say, Gravity did okay. Commercially, we had some really large customers. And we donated Gravity to other companies. It's now open source. You can find it on GitHub. But needless to say, once Teleport became available, which also happened around the time COVID hit, a lot of organizations started to rethink how access is implemented and enforced for their infrastructure. And the rest is history. So Teleport became really popular, and all of you discovered it. So thank you for coming. And now I guess it's a success story.

Ben: 00:28:06.391 And you might have mentioned this briefly, but Teleport is an open-source, open-core company. Can you sort of describe why open source is important to Teleport?

Ev: 00:28:15.427 The way it was open source, it was almost accidental. At the time, we hadn't thought about this much. We wanted to hire the best engineers available worldwide. And engineers, they enjoy working on open-source software. So that was kind of maybe the original reason because we wanted the most ambitious engineers to join the company. But also, the founders of Teleport, we actually did not come from a classic traditional cybersecurity background. We came from an infrastructure background. Teleport was founded by cloud founders. And for that reason, let's just say we were extremely careful, almost conservative on all things security. And we saw open source as an additional check on a company. I remember in the early days, I think it was either all of Teleport or something about Teleport got posted on Hacker News. And I remember all these programmers who frequently go to that website, they started dissecting Teleport source code line by line, arguing between each other. Is it secure or not? And it felt a little bit terrifying because people are examining the result of your work. But when they collectively agreed that that was the most secure way to get something done, the feeling was extremely reassuring. So that is probably the most important reason I can think of why we decided to stick to the open-source model: because it keeps us honest. It elevates the bar. Because I think any engineer, when they know that after typing git push, eight billion people will see what they have just done, even though, of course, they're not going to look at it right away, will think twice. I think it makes all of us think twice about the code we write, about our designs. But it also invites other companies and the community to participate in that process.

Ev: 00:30:11.167 So the value of being open is not just because your code is available, it is also extremely important to be collaborative. So if you look at most projects you're hearing about today, we're talking about machine ID and identity, security, governance, and Teleport Access Graph, look, all of these things have design documents. And you could see them on GitHub, and you could see discussions. And you could see extremely smart people who work at the best cybersecurity teams in the industry participate. So it allows us to actually build Teleport in the open. So we believe in having thousands of eyes looking at what we're doing. I guess this is how we deal with our realization that you cannot be the smartest person in the world, but you do want to build the smartest product in the world. So that's, I think, why we open source. It also builds trust, I believe. In the opening keynote, even though it was about management teams, there was this idea that trust is the foundation of everything. And then transparency was right there because those two go hand in hand. So we are open source because we want to build trust with our customers, users, and the broader community.

Access patterns to achieve the right access levels

Ben: 00:31:18.952 So going a little bit back to the pillars of access, we talked a little bit about authorization. And we both worked at a very large hosting provider back in the day, and we saw some interesting patterns around access to resources, both from support and engineers, and some interesting organizational complexity around those. What are some patterns that you've seen that have worked well, that have given engineers and teams the right amount of access to their systems?

Ev: 00:31:41.162 My view on authorization is that it's still broken today. What day is it today? October 25th.

Ben: 00:31:46.800 Fifth, yeah.

Ev: 00:31:47.101 2023, authorization is broken. So here's why it's broken. So imagine a table in a MySQL database or Oracle database or SQL Server, pick your favorite. Now, can you confidently tell me a list of people who could delete entries in that table at 2:00 PM tomorrow? Just think about it. Is it a hard question to answer? Is it an easy question to answer? And then if you can produce the answer, how confident are you that that's true? And here's why it's really, really hard. So let's just say you can go into that database and you could examine different roles and permissions for users that exist in that database. But the scope of those users is limited just to that database, which means that even if you configured RBAC properly, I could go through SSH into that box and I could do a SQL dump. I could take it out. I could modify that table, and then I could insert it back. And none of this will be detected by the authorization engine of the database because I've done it using a completely different pathway. But it's not just SSH versus database. Because if you're running that database in Kubernetes, I could get into the pod using the Kubernetes API. And if you're doing it in the cloud, I could do an EBS volume dump using the EC2 API. I could use the AWS console. You see, all of these things also have authentication. They have users. They have passwords and whatnot. But authorization engines are different everywhere. How common is it for organizations to be so well-organized that the role-based access control rules are fully synchronized across all these different layers of technology? And by the way, there are more layers that I could have mentioned. And the thing is, almost no one does this today. So that's why we believe that authorization is broken. So authorization today is extremely siloed.
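
To make the "who can delete entries in that table" question concrete, here is a minimal Python sketch. It is not Teleport code. The layer names, capability strings, and users below are all hypothetical, and it simply assumes that a shell on the host, exec access to the pod, or a volume snapshot is enough to alter the table indirectly, as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    layer: str       # which authorization engine issued the grant
    principal: str   # the user holding the grant
    capability: str  # what the grant allows at that layer


# Hypothetical capabilities that are sufficient, at each layer,
# to delete or rewrite rows in the table.
SUFFICIENT = {
    "database": {"DELETE"},
    "ssh": {"shell"},
    "kubernetes": {"pods/exec"},
    "cloud": {"volume-snapshot"},
}

# Made-up grants collected from four separate authorization engines.
GRANTS = [
    Grant("database", "alice", "DELETE"),
    Grant("database", "bob", "SELECT"),
    Grant("ssh", "bob", "shell"),
    Grant("kubernetes", "carol", "pods/exec"),
    Grant("cloud", "dave", "volume-snapshot"),
    Grant("database", "erin", "SELECT"),
]


def who_can_delete(grants):
    """Map each principal to the layers through which they could alter the table."""
    result = {}
    for grant in grants:
        if grant.capability in SUFFICIENT.get(grant.layer, set()):
            result.setdefault(grant.principal, set()).add(grant.layer)
    return result


# What the database's own RBAC reports versus the effective answer.
database_only = {g.principal for g in GRANTS
                 if g.layer == "database" and g.capability == "DELETE"}
effective = who_can_delete(GRANTS)
```

In this toy data the database's own view names only `alice`, while the effective answer also includes `bob`, `carol`, and `dave`, each arriving through a different pathway that the database's authorization engine never sees.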

Ev: 00:33:41.256 Here's another example. You could think almost every company wants to implement a rule that says developers must never read production data. Kind of makes sense. You don't want a random Googler to read your Gmail, okay? Now think about what it actually means to implement that rule. How many different authorization engines, and how many different layers of your technology stack, do you need to go and update, and update properly, for that rule to be enforced? I consider this to be the next biggest problem we need to solve with authorization. We could say that for authentication, we almost solved this problem by introducing SSO. You have single sign-on that allows you to log in once, and then you have access to a bunch of systems. And now look at Okta. A hugely successful company because they're an SSO provider. My question is: where is SSO for authorization? Where is it that I can go and define that I'm an intern and I should only access things and have privileges that interns are allowed in this company? We have not solved this problem today. And it's one of the biggest reasons why the frequency of those breaches goes up and up. Because effectively, it means that in almost every hack, the blast radius is enormous because we (and when I say we, I mean all of us collectively, the industry) continue to run a bunch of workloads that essentially are misconfigured. I'm not claiming that Teleport solved this problem. We're working on it. So there's a research project that we've been investing in for almost two years now called Teleport Access Graph (TAG). That's our attempt at solving it. I'm hoping that we're going to start actually making pieces of it available gradually in 2024. But I invite everyone to visit the presentation by our CTO about TAG. At least you will kind of see what approach we're trying to take.
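
One way to picture "SSO for authorization" is a single place where role privileges are defined, which every layer then consults instead of keeping its own copy. The Python sketch below illustrates that idea only. It is not how Teleport Access Graph works, and the roles, actions, environments, and layer names are assumptions made up for the example.

```python
# Single, central definition of what each role may do (hypothetical roles).
# Anything not listed is denied by default.
ROLE_PERMISSIONS = {
    "developer": {("read", "staging"), ("write", "staging")},
    "sre": {("read", "staging"), ("read", "production")},
    "intern": {("read", "staging")},
}

# Access pathways that would each normally carry their own authorization engine.
LAYERS = ("database", "ssh", "kubernetes", "cloud")


def is_allowed(role, action, environment):
    """Central decision point: the answer does not depend on the pathway used."""
    return (action, environment) in ROLE_PERMISSIONS.get(role, set())


def decisions_per_layer(role, action, environment):
    """Every layer asks the same central function, so the answers cannot drift."""
    return {layer: is_allowed(role, action, environment) for layer in LAYERS}


developer_prod_read = decisions_per_layer("developer", "read", "production")
intern_staging_read = decisions_per_layer("intern", "read", "staging")
```

Here the rule "developers must never read production data" is stated once, by leaving `("read", "production")` out of the developer entry, and it holds for the database, SSH, Kubernetes, and cloud pathways alike. The hard part in practice, which this sketch skips entirely, is getting each real system to defer to such a central definition.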

Tips to reduce the possibility of attack on infrastructure

Ben: 00:35:36.832 So I'm going to follow up with two questions. I'm going to touch a little bit on audit since it's last. It skips the agenda. So I know CrowdStrike reported that it can take 84 minutes between a credential being leaked and a breach, which seems like a very short amount of time. What are some tips for teams to sort of get ahead of this to reduce the possibility of attacks on their infrastructure?

Ev: 00:36:02.412 We obviously talked about zero trust before. So the zero-trust architecture basically means that the blast radius will be hopefully contained to a single workload that got infected. So that thing alone, if you are a true zero-trust company, helps. Using the iPhone as an example: if one iPhone somehow got breached, maybe because I'm just going to yank it out of your hand when it's unlocked, I cannot infect other iPhones from this one. It's completely isolated. That's zero trust. No iPhone trusting the other iPhone. So if your infrastructure is like that, then you largely solved this problem earlier. Which kind of brings me to... I'm about to say something really controversial that might even ruin my career someday. I believe that the industry right now pays oversized attention to visibility and audit logging. And the reason is because authorization is broken. So think about it this way. What is policy? Policy is a set of rules that need to be obeyed. They need to be true if you don't want your data to be misused. If you want your system to be trustworthy, your policy must be enforced. But you can also think of a policy as program code. It's a declarative programming language on some level. Now, the question is, how can you validate your policy? How can you write unit tests that say that your policy is correct? And policy today is fragmented, incredibly fragmented, because different policy engines or authorization engines are not compatible with each other. And they are, let's just say, at different levels of maturity. So because your authorization is broken, a lot of cybersecurity professionals simply say, "Hey, I know that my infrastructure has authorization problems. So, therefore, I need to assume that I always have bad actors on my network. And observability is how I'm going to catch them."

Ev: 00:37:59.160 This is why a lot of people are investing into anomaly detection, into SIEM systems, because that's actually the last line of defense. If you couldn't prevent the bad actor from entering your infrastructure, at least you can try to catch them before 84 minutes expire. And here's my controversial take, that I believe that once we actually build a fully functioning access control system that solves this unsolved authorization problem, then the importance of observability will go down. I'm not saying it's going to be useless or obsolete. But I do think we will have a more balanced view on cybersecurity, where bad actors will become an extremely rare event on your infrastructure.
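
The idea of treating policy as declarative code that can be unit tested can be sketched in a few lines of Python. This is an illustration under assumptions, not an existing policy engine: the rule format, the `*` wildcard, and the role names are all hypothetical.

```python
# Invariants the organization wants to hold, written as (role, action, environment)
# combinations that must never be allowed.
FORBIDDEN = [("developer", "read", "production")]

# A policy is plain data: for each role, a list of (action, environment) rules.
GOOD_POLICY = {
    "developer": [("read", "staging"), ("write", "staging")],
    "sre": [("read", "production")],
}

# A plausible misconfiguration: the wildcard silently includes production.
BAD_POLICY = {
    "developer": [("read", "*")],
    "sre": [("read", "production")],
}


def allows(policy, role, action, environment):
    """Return True if any rule for the role matches the request."""
    for rule_action, rule_environment in policy.get(role, []):
        if rule_action in ("*", action) and rule_environment in ("*", environment):
            return True
    return False


def violations(policy, forbidden=FORBIDDEN):
    """List every forbidden combination that the policy would still allow."""
    return [combo for combo in forbidden if allows(policy, *combo)]


good_result = violations(GOOD_POLICY)
bad_result = violations(BAD_POLICY)
```

A check like `violations(policy) == []` can run before a policy change is rolled out, which catches the wildcard mistake in `BAD_POLICY` ahead of time instead of leaving it for anomaly detection to find. It only works, though, if there is one policy to test, which is the fragmentation problem described above.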

One practical tip to secure infrastructure

Ben: 00:38:46.426 Great. And just to close it out, I would like to ask the guest for just one practical tip that teams can deploy today or this week to secure their infrastructure.

Ev: 00:38:56.282 I'm really bad at over-generalizing. But I do want to return to this theme that at the end of the day, it's all about people. It's not hard to build a very secure system. Just cut access to everyone and everything. It will sit there completely unused. It will be super secure. But people won't be able to get anything done. And it also applies to the pattern of adopting secure technologies. I'm not sure how well-known that fact is. But it's actually quite common for engineers to build backdoors into the infrastructure of companies they work at. And they're doing it on purpose. And we brag about it to each other when we go to parties. You probably all heard about it. You might meet an engineer from company X and you ask them, "What do you guys do for access control to your cloud?" And they will say, "Oh, we use this vendor and that vendor. But I built a little proxy here on the side because when the latency is up and my Cassandra is dropping packets, I don't have time to go through all this kind of official bullshit. So I have this thing that I use to kind of fix the problem." It's really common. This happens because cybersecurity vendors build solutions that just get in the way and prevent people from being productive. This also happens because cybersecurity professionals who buy these solutions and set things up only care about this security thing in isolation from everything else. And that's the practical tip, though I'm not sure how practical it is: just put security aside for a second and let's all remember that we're in the business of doing computing, which is this beautiful dance of hardware, software, and people. And we are in the business of making it trustworthy. We all need to trust each other. By making things closed, hard to use, making things get in the way, you're destroying trust. You're making engineers distrust security teams who work in the same company.

Ev: 00:40:50.379 We actually see it even when we go into an organization to have a conversation about using Teleport over there. It's very common to feel the animosity that exists in the room between engineering and security. Because people feel that security gets in the way. It prevents them from being productive. So let's just be nice to each other and not do that and design systems to be trustworthy and nice to use.

Ben: 00:41:15.877 Thank you, Ev. [music]
