iDevNews: RBAC for SSH and Kubernetes Access
RBAC for SSH and Kubernetes Access with Teleport - Overview
Enterprises are best served by leveraging an RBAC system to manage access to their SSH and Kubernetes resources. With Teleport, an open source software, employers are able to provide granular access controls to developers based on the access they need and when they need it. This makes it possible for employers to maintain secure access without getting in the way of their developers’ daily operations. Steven Martin, Solution Engineer at Teleport, demonstrates how to assign access to developers and SRE’s across environments with Teleport through roles mapped from enterprises’ identity providers or SSOs.
Key Topics on RBAC for SSH and Kubernetes Access:
- Teleport is open source, written in Go, by engineers for engineers,
- Teleport, fully compatible with SSH and your existing tooling, provides security that doesn't get in the way.
- Every user in Teleport is always assigned a set of roles. One can think of them as "SSH Roles".
- The open source edition of Teleport automatically assigns every user to the built-in
admin
role, but Teleport Enterprise allows administrators to define their own roles with far greater control over the user permissions. - Teleport, with RBAC and PAM integration, offers flexibility and easy tracking.
- RBAC is almost always used in conjunction with Single Sign-On, but it also works with users stored in Teleport's internal database.
- RBAC gives Teleport admins more granular access controls as well as flexibility to control how users access resources, what they're allowed or denied, and how to audit it.
Expanding Your Knowledge on RBAC for SSH and Kubernetes Access
- How to use Teleport RBAC with OpenSSH servers
- SSH into your AWS infrastructure Using GitHub for RBAC
- Applying the Principles of Zero Trust to SSH
Learn More About RBAC for SSH and Kubernetes Access
Introduction - RBAC for SSH and Kubernetes Access
(The transcript of the session)
Vance: And welcome to the session for Teleport here at today's Application Architecture Summit. Let me introduce you to our speaker for today's session, Steven Martin, solutions engineer at Teleport. Steven, welcome.
Steven: Thank you. Glad to be here.
Vance: We're glad to have Steven with us here too. He brings more than 20 years of enterprise IT experience to his post at Teleport. And notably, for today's session, Steven has a great focus on application architecture. He is an expert in large enterprise application implementations for on-premise and cloud across corporate and government customers. He brings his expertise to us in a session entitled “Role-Based Access Control for SSH and Kubernetes Access with Teleport”. In today's session, we're going to learn why enterprises are best-served by leveraging in our back system to manage access to their SSH and Kubernetes resources. We'll also learn about Teleport, an open source software that provides granular access control to developers based on their specific needs. And Steven will also show us a great demo so we can see in action how companies use Teleport’s RBAC best practices which come out of the box, frankly, to easily let employees maintain secure access to what they need when they need it without getting in the way of developer daily operations. Just a quick reminder before I hand it to Steven. You can download the slides today. Just hit that big red button under the “View screen”. We've also got some great whitepapers and other assets available for you, so just click any of the links that hit your fancy. You can get them all without any extra registration. You did that to join us today. And we encourage your questions, so just type in the “Submit a question” box, and we'll get them right to Steven. So Steven, with that, let me turn it to you. And tell us about role-based access control for SSH and Kubernetes access with Teleport.
Steven: Thanks again. Yeah. So let's get started taking a look at our agenda. We're going to start with a brief overview of Teleport, as probably some of you are not aware of the solution. Or you maybe have heard of it but haven't seen it in action. So we'll go over that, as well as a brief demo. We'll be comparing role-based access to other methods because role-based is not the only way to manage access control and why this is one of the better methods to use for these type of access control. We'll then be going to how do you define and use roles within Teleport. How is that done? How does it provide you that granular access as well as flexibility for you to control how users access your resources, what they're allowed, what they are denied, and then how to audit it. And lastly, we'll go into a more detailed demonstration with using multiple configured roles so you can see more real-life type situation where we go from development to production, troubleshooting, getting into resources, but really using a least-privileged approach to make sure that we're not providing too much access and that we're also making it a seamless experience for your users.
Brief Overview of Teleport
Steven: So here, we're going to go into a brief overview of Teleport. Now, you can see here — it's a high-level deployment example. Teleport is a simple secure access solution that provides access to SSH and Kubernetes machines. You can see there are some main components, the auth service in the middle. That is what maintains the certificate authority, as well as the audit log. One of the unique things about Teleport is that it uses short-lived certificates instead of static keys to maintain access. So Teleport is its own certificate authority. Now, the Auth Gateway, that is how users interact with the system and also is what manages the access to SSH as well Kubernetes. So users will interact with web, command lines like kubectl, tsh. Now, because we have such native support for open SSH, you can still use the SSH client in your machine just as you might've already been doing for years now and probably a couple of decades for some people. But again, instead of using static keys, you're issued that short-lived certificate along with whatever roles are assigned to you.
Steven: Now, once you have that certificate, you continue to go through the Auth Gateway that allows you to connect to those resources. Or it denies you if you do not have the appropriate access. Now, in addition to having a local user base that you can maintain, it also works with your single sign-on identity manager, such as Azure Active Directory, Auth0, G Suite, any of the popular SSOs. We support OIDC and SAML integration, as well as GitHub. That allows you to authenticate the users, pass in the roles. And you can also send in other metadata, such as their usernames and other tags that are helpful in configuring roles and making sure that you're controlling it. Ask your identity manager what they can access and not just within Teleport. Now, in addition to assigning roles by default from the identity manager, you also have the ability through our approval workflow to provide more access on requests. So a user that's a developer, day to day, he needs developer resources. But in some cases, he actually needs some additional access, such as to staging or production or to specific machines. And with that, he can use the approval workflow. And you can see the top, where an administrator can approve or deny. In this case, you see the Slack icon. And we support a number of plugins to different operational systems like Slack, Mattermost, PagerDuty, GitLab. And that allows you to interact with the system and approve or deny. And that'll be part of our demonstration, where you see someone who needs to get into production requesting that and are approved or denied.
Demonstration with Multiple Configured Roles
Steven: So before I go further in more of the configuration architecture, let's see an example of what Teleport is and how you interact with it. So in this case, we're in Teleport. You can see my user. I'm connected to a number of clusters. I'm at the root cluster. And Teleport supports connecting multiple Teleport clusters together. That's a good way that, if you have separate sets of nodes across different clouds or Edge, that allows you to connect all of them together under a single umbrella and control how that access is maintained. If I go into the cluster, I can then see a set of nodes. Each of those nodes have specific labels, which indicate both information, as well as it can be used for security, such as what tier they are. In this case, we're using a dev staging and prod type of tier to control our access as well as the type. Now, I also see the addresses of those specific nodes, and that tells me how I'm connecting to it. In this case, I'm connecting through this [inaudible] port. And in another case, I'm using tunneling, which is very helpful when you're behind a net or perhaps connecting to an IoT device. Now, from here, I can select which user I'd like to log in with to open a SSH session.
Steven: Once I open, it's just like being in a regular SSH sessions that you've done from the command line. I can see what Python program is running. I also have some information in terms of what is the session I'm in. I can see who I'm logged in as, my [inaudible] user ID, as well as the session IDs. Now from here, within this same window, I can open up other sessions, connect to different clusters, and open up similar user IDs. When I'm ready to exit, just type exit as normal and close. Now going back, everything I did there was recorded both in the audit log and the sessions. So if we go to look at the audit log, we can see my usage here, when I've logged in, when the session has ended. And here is the recording that we can replay. Now, in addition to that SSH interactivity, we also have the configuration of Teleport, which we'll be going back into more detail later. But you can see here an example of a dev role, someone who has access to the dev-labeled nodes, as well as the type, and then what roles that user can also request.
RBAC vs Other Methods for Securing Access in Your Enterprise
Steven: So let's continue on to get more detail on role-based access control. So role-based access is primarily defining roles that fit to particular jobs or functions. And that's allowing you to do it in a central place to finding them. Now, the other types of access control are typically discretionary and mandatory. Discretionary is more when you want the owners of particular resources to manage who can access and how. Can you change them? Can you read them? Can you delete? Often, the challenge with direct discretionary, which is often associated with Windows-type resources, is the audibility and the tracking of it because now, you could have thousands or even millions of resources that you're not quite clear how they're configured. And it can be difficult to track that many versus a more centralized approach. The other item most used in very secure systems is mandatory access control, so limiting — you have the right clearance. Are you in the right classification?
Steven: Now, the complexity typically in mandatory is that there can be so many classifications, and you have to keep publishing and updating across all of your resources. One of the big pluses on role-based access is that it's able to incorporate some parts of discretionary and mandatory. So in the case of connecting to an SSH node — because Teleport, with our role-based access and our PAM integration, the PAM integration could actually reject if you don't form a match to a particular role. So there's a lot of flexibility there where, yes, it is centralized. It is role-based. But you can use some instances of mandatory. And that's what typically happens with these kind of systems, where when you use role-based, it's centrally managed. You have it. But you will have some cases where you'd want to incorporate parts of the other types of access controls.
Defining and Using Roles Within Teleport
Steven: Let's get into some details on how we configure roles in Teleport. The first part is around the allowing and denying of accessing. And by default, everything is denied. And then you're typically mostly defining what's allowed, what SSH nodes they can get into, what logins they can use, what Kubernetes they can access. Now, another part is, as I said before, you have your developer role. Well, sometimes, I need to go to staging and I and need to get to prod. It's not a role that I get by default. But I want to be able to find that I can request those. Next is the node label. So as you saw, we had those labeled dev, staging, and prod. So we want to define the nodes that they're allowed to access and label them within the role that will then match the nodes. And also, the login, what users I'd go in. I went in as app user. Do you want to define app user? Do you want to define Ubuntu EC2 user? Or you can also — using the metadata of the user, you can write in the actual login of the user. Like in my case, Steven could be available. And that allows you to uniquely identify the user versus using more common users.
Steven: Another important part is around your Kubernetes configuration, saying what groups and users that are allowed to connect to the cluster [inaudible]. Teleport, and we're very into Kubernetes in terms of its usage from kubectl and managing it. We support impersonating groups or individual users. And in the configuration, you specify one or more Kubernetes groups or users, and then you're allowed to impersonate all going through our proxy. One of the important configurations on role-based access is, well, how long should a session last. And that's called time to live. So often, that could be a day's work. It could be eight hours. It could be 12 hours, whatever is appropriate for that. In cases where it's a more privilege set of role and the data or the systems are sensitive, you may reduce that. So you could bring that down to, say, an hour or even minutes. Teleport has a lot of flexibility in terms of configuring that. It will use, by default, the lowest TTL across all of your roles.
Steven: Now lastly, as you saw, there are a number of features within Teleport that you can configure. And that's part of the role configurations as well, is you configure who can access, whether it's just being able to view it or being able to actually modify the information. So that's, again, part of that role configuration. So let's go through from a simple type role configuration to a more advanced. So first, you have an example here where we have a kubernetes_group of “developer” that can log in with the ubuntu user. Any nodes with developer and just some other settings, like the TTL of eight hours. So that's literally, as it's called, a simple role. Now, going beyond that, we now want to look at, okay, well, this developer, on occasion, does need to get into production. So we're going to set our requested roles for production. That's in the middle here. Now, going a little bit further, we have some cases where, well, I want to be able to set from my single sign-on to populate into the role. So that's where we have these metatags of external. So we [use the command] {{external.k8sgroups}}
, so that'll automatically populate either an individual item or a list of items into this role. We also have the logins. As I mentioned, Steven could come in. So if Steven is in the users lists in my single sign-on, that will automatically populate as a login.
Steven: Now there's also the no labels. So instead of having type dev, I could put generic types from the SSO. That controls what types I'm allowed access to. Now, you'll also notice that I've now added a deny. Before we had no deny settings. We said, well, even if you have external.users, then [inaudible]." We're going to deny that because denies take the precedence. And that's one item — depending on your system, you may not want to allow root. You may want to not by default allow that, even if it's set in your SSO. Now lastly, and we've gone through our most detailed one, is now we're setting, well, what can they access in Teleport. So they'll be able to view their roles. And that's a best practice. Just like when you're in a Linux system, you want to know, "What are the groups here? What are my privileges? What is the access available for these resources?" Within Teleport, you can see a list of the roles and understand, "Okay. This is why I'm only seeing these type of nodes. This is why I'm in this particular Kubernetes group," so allowing the user to see that.
Steven: Next is them being able to look at sessions. What active sessions are there? That's setting that resource and also setting the event. Can they review the event log and see when people log in and came in. And in some cases, you may not want them to have that. There may be activity that you don't necessarily want to share with that, but that's your option. In terms of an administrative role, this would also have create, update, and delete. But typically, for a developer, you would not provide that. Now looking in terms of how our single sign-on interacts with our roles, whether this is Active Directory or Auth0, you would typically start with a user. And they have a set of groups of roles they map too. This user is in the developer group, as well as a DB Admin. Now, in terms of how to map that map to Teleport, it's simply a one to many. So when the Auth Connector in Teleport is set, you'll say, "Well, this developer maps to a dev Teleport role as well as a builder role," just as we walked through. Those will have their specifics on what nodes and logins are available. And then you also might have a Dev DB role set for Teleport about accessing specific machines. Are you going to allow port forwarding so they can open a connection to the database? That's what's set.
Multiple Roles
Steven: And it's not uncommon for users to have multiple roles. I would say, in terms of best practice, you don't want to try to fit all of their particulars into one role. You'll just find that you're over-complicating it, and it's best to split out those roles and define them for the specific function that that person is trying to do. Now, once you've set these roles and they've authenticated and you've assigned those roles, then you'll be able to access those resources, those Kubernetes and those SSH nodes that you've seen in action already. And you saw it briefly when I first went into Teleport. And we have the idea of the trusted clusters. So that takes a completely separate deployment of Teleport that has its own set of roles, SSH nodes connecting Kubernetes clusters. And that can trust back to a root node. And when you do that, one of the things you're doing there is, not only are you trusting it, but you're now saying you can connect through that root cluster. So in some cases, people do that because that enables me not to have to expose the leaf clusters to the original user. So they can go through just the root cluster, and it uses a reverse tunneling to enable access there.
Mapping Roles
Steven: The other part of it is for mapping roles. So let's say, in those other trusted clusters, you don't have all the same type of machines. You might have only your production, or it's a vastly different type of setup. So that allows you to map those roles to there. You may not have a developer mapping, or there isn't a build type in that particular machine. So that gives you a lot of flexibility in terms of still being able to reuse the same roles but authoring them to use them in a way for that particular cluster. Well, now let's step through actually interacting with these particular roles. And you can see, going from a particular developer role and then how he's going to assist in a problem with production. So going back here in terms of the roles we have, again, we have a set of Dev, Dev DB, Prod, and Staging. And if we look at the Dev, we can see it has a particular Kubernetes setting. You might have set the users coming in, and logins, app user, and DB user, along with the node labeling. Now, this user can also request staging, prod, and monitor.
Steven: Now, this monitor role is meant as a user that can go into the system but does not actually start or stop any of the processes on that system. Really, it's more of an auditing or performance monitoring or checking of those systems. So that's a case of, if you're having an issue in production, it needs someone to access these systems. Rather than giving them the user that can alter the page of those systems, you would first want to give them the monitor user. And then they can go in, confirm if there's an issue, and then go back in and actually alter it if needed, or have someone else who already has that access to do it. So in this case, I'm going to switch to the command line, where the developer user is going to log in. That's where I'm going to be using the tsh client. Previously, you've seen me log in with self Steven, and now I'm going as the development user. Now, every time I log in, I'll get this status telling me what cluster I'm in, in this case the root cluster, what role I have dev, and what logins are available from that. You'll see that there are some additional logins that were in the role because those are coming in automatically from my identity provider.
Steven: And then, from here, I can list what are the nodes available to me. And it's a similar output from the web, in that I see two nodes. In this case, the user only sees two because they only have access to tier dev and build and app. You can open up a SSH session and, just like before, interact just as in a normal SSH session. Now, the scenario we're encountering here is that we're seeing different behavior in our production node than we are in some of our dev and our staging environment. We'd like the developer to investigate that. We've looked at it, and we're not quite sure why that's happening or if something changed in the environment. And the developer is going — we're tasking him to do that. So in this case, again, he doesn't have direct access to that node. So in order to do that, he's going to say, "Okay. Well, I know I can request prod." So he's going to put his request roles for prod.
Steven: So he's requesting for prod. He gets a request ID. Now, as I mentioned before, that really isn't the right policy. He's directly going into a role that will have the app user and will be able to alter the behavior in the system. So we see that I'm someone in this private channel who's able to approve or deny what's coming in. And I see that he's in the example cluster with user and prod, and it's pending. And then I'll speak to him, and I'll remind him that that's not our protocol. You need to get the monitor type role and go and investigate the system before you jump to using an account that can actually alter the system. So I'm going to deny that. Now we can come back. We can see that his request can be denied. They'll have the little chat and see, "Oh, right. I'm supposed to go in and use the monitor." So in this case, we're going to approve having the monitor. Once we've been approved, now his roles have updated. So he has both the dev and monitor roles, and he now has the monitor user. So now he can see all the prod and staging systems, but he's only able to access it from monitor.
Steven: And let's see an example of where he might forget and say, "Okay. I'm going to go ahead and log in as app user into this prod machine." And that will be denied specifically because, while the monitor role is giving that access, it's not giving the app user access. And while he's doing that, that's also going into our audit log. So you can see the actions of the access request, denial, approved, and then a inappropriate attempt to access a system we don't have access to. He'll say, "Okay. I'll do this right, and I'll go in as monitor." Now, once he's in the system, he's going to check to see, "Okay, what Python program are we running right now? Oh, it's the app1. We actually want to be running the app2.py file instead. Okay, I need to do that." He's going to exit, let the manager know that, "Okay, I can do that. But I'll need the production access." Again, he's going to ask for prod. But because he has that justification, they should grant it. And now, because there is a reason to go into production and use the prod role, he will be approved.
Steven: And again, his status is updated. He gets the additional production role, and he'll be able to go into the system. Now, while he's in the system, because this user has access as well to that, he can see the active session and also join him at the same time. So that's part of our joining session feature. It allows you to share a session, both for monitoring as well as for, in some cases, educational. So again, we're going to look for that Python script running. I'm just going to kill it and start the right one. All the while, the user in the web is able to monitor that. And he can also interact with it if he chose to as it's being — whenever you're in here, you can see who's in here as well. So often, this is something [inaudible] we use as part of a two-person role which you always need two people. So now, we start to write application. And we hope that that was the right thing to do in production and we're seeing the correct behavior. Now we can see that session goes away. Their certificate will expire, and they would no longer have access to that production system.
Talk Summary
Steven: So today, we've seen that, using a particular least-privileged approach, the user has access to their default developer nodes as what is their day-to-day job needs. In some cases, they can get access to monitor that system or check on the status in other more privileged or sensitive systems. And then, when needed, they can actually go in, alter a process or start a new process with a particular role. And we've shown you how, in some cases, it's best not to just give the direct prod access but give a particular — again, least-privilege, when needed, access to those sensitive systems. How you can do this easily using Teleport? Be reminded this is an open source product, written in Go. It's available from my GitHub. You're able to download it. It's written by engineers for engineers. So it's really meant to be as seamless as possible for your day-to-day interactions, just like regular SSH and kubectl interaction. It doesn't get in the way. It's intuitive to get into the website using it from the command line. And it's fully compatible with SSH. You can use it with your existing open SSH servers or using it with the Teleport agent.
Steven: In terms of what we've covered today, this is the URL that covers role-based access control for SSH and Kubernetes in our documentation. You're also welcome to download Teleport directly as our community version. And please, go to GitHub. You're welcome to pilot it yourself, try it out, as well as contribute. So that wraps up my session. I'm going to turn it back to Vance and see if we have any questions.
Vance: Steven, great session, really great look at ways to give role-based access to all these distributed environments for all sorts of modern application architectures. And who doesn't love a demo? And it was really great, all the different real-world use cases you showed us. So terrific session.
Steven: Thank you.
Vance: Steven, you mentioned there might be some questions. There are a couple of really good real-world implementation ones in here. But before we get into it, let's take a top-level view, so many cool features here in Teleport. Let's be clear about how you brought together support for an SSH gateway and native support for Kubernetes. The question here says, "Could we authenticate once to access both Kubernetes clusters and any underlying servers?"
Steven: Yes. Part of your role when you authenticate is you're given a kubeconfig, and that allows you to connect through the proxy to the Kubernetes cluster. To the normal user, that doesn't appear any different. But it's actually going through our proxy and ensuring that you're only going in as the correct user or group.
Q&A
Vance: Fantastic. I mentioned a few minutes ago the idea that you had some great real-world situations. Here's a question that came up. You mentioned your support for identity providers. If a user's role changes at their identity provider, when does that take effect in Teleport?
Steven: So we're authenticating against the provider, and they're sending the traits or claims back to us. So the next time that they authenticate, that's when we'll apply those particular roles, as well as the metadata you saw. So if I had my user, Steven, come in or app user, the next time, I can log out, log back in, and then that will enforce it. So that's one of the good things about using short-lived certificates, is then you're always getting the latest every time they authenticate. So you're not just getting a static key that sits there and has the same type of authentication settings.
Vance: Oh, great. Great. Another question, just the levels at where you are in the application, let's call it the master toolkit environment. What is the difference in your view, Steven, between Teleport and open SSH?
Steven: The major difference is you're able to control who can go in and how they go in, in a central place. Whereas, typically with open SSH, you are distributing keys that provide one or more types of logins. You're allowed to log in as root, or you're allowed to log in as another user. But it's not really centrally maintained. It's often not in real-time. And to update that can be difficult. And also, it's not always secure because I have to transmit those files in various ways, store them, track them. That makes it difficult. Whereas, with this approach, because we're always confirming that your role is allowed to access that — and if the role changes, Teleport will enforce that. It won't let you access it anymore. If your role changes from — let's say we take away app, and you're only able to get into builder, then you'll only get into builder. So it's really keeping that at a central place versus really individual user management, which is harder.
Vance: Excellent. Excellent. I see time is just about up. But I think we've got time for one more. Let's turn back to your demo and features again. Someone pointed out some of the capabilities to extend Teleport. And the question is why does Teleport have access plugin roles in your configuration?
Steven: And that's because we have a workflow API. And that's how those plugins are interacting with Slack or Jira or GitLab. And it also acts as a user. So whenever you're in there, everything has to be authenticated and either allowed or deny particular actions. So that access plugin role allows us to say that that particular plugin is allowed to get requests and update and maintain requests. That's fundamental to the way the system works, is using that role base for everything. And also, these plugins are open source on our system as well. And it's pretty transparent what they're doing, but they're doing it in a very secure way, also using certificates in how they connect and making sure that they're controlled in terms of what actions they can do.
Recommended Next Steps
Vance: Steven, with that, I'm afraid time has expired. But we can't let you go until you give us some suggestions on best ways forward. You gave us some great links and other suggestions to get the code and other educational resource. So let's quickly [inaudible] for our attendees.
Steven: First, you can go ahead and download the community version. And then we have a number of quickstarts in our documentation on how to quickly get up and going, both for getting SSH access and then Kubernetes. We have a series of videos that take you through the various features of Teleport. And that's available within the documentation and some of our other resource pages. That's a great way to get into it to understand some of the more complex types of interactions that you might have, not just the first login. And again, it's written in Go. It's open source. You can get it from GitHub. You can compile it yourself; check it out. Again, those plugins could be extended. So there's a lot of extensibility and ability to really transparently see what we're doing at Teleport through our open source approach.
Vance: Love it. Love it. Steven Martin, solutions engineer at Teleport, thanks again for a great overview of Teleport and how it works in real life and especially for all the next-step assets that you provided us today. A really great session, and we're really glad you could come.
Steven: Thank you for having me.
Vance: My pleasure. And as Steven mentioned, many of the links and other resources he mentioned are going to be locally right here in the breakout room. So please, take advantage of those. And as you can see, lots going on at Teleport these days. We couldn't fit everything. So here's a slide that'll take you directly to the Teleport website with some other terrific resources. Just download Steven's slides, and all of these links will be live. Thanks again, everyone.
Join The Teleport Community