Securing SSH and Kubernetes Access
Using Teleport to Secure SSH and Kubernetes Access - overview
As part of Teleport's webinar speaker series, Colin Wood gives a great demo on how to use Teleport to secure SSH and Kubernetes access.
Key topics on Using Teleport to Secure SSH and Kubernetes Access
- The four elements of access are connectivity, authentication, authorization and audit.
- Any solution that doesn't provide the four elements will leave you vulnerable or will severely reduce your productivity.
- Teleport is a solution for secure developer access that doesn't get in the way.
- This webinar presents three demos: Teleport RBAC, Teleport access requests and Teleport session locking.
Expanding your knowledge on Using Teleport to Secure SSH and Kubernetes Access
- Teleport Kubernetes Access
- Teleport Server Access
- Teleport Application Access
- Teleport Database Access
- Teleport Connect
- Teleport Machine ID
Learn more about Using Teleport to Secure SSH and Kubernetes Access
Introduction - Using Teleport to Secure SSH and Kubernetes Access
(The transcript of the session)
Colin: 00:00:01.674 Hello everybody. My name's Colin Wood. I am a senior solution engineer here at Teleport. Hopefully everybody can see a slide that says, "The easiest, most secure way to access all your infrastructure." And that's what we're going to talk about today. And I'm going to demonstrate Teleport to you in three different scenarios. So let's dive in. I'm going to give a brief overview of Teleport itself, then we'll dive into the demo. So the Teleport access platform is zero trust, certificate-based, and it provides advanced just-in-time access workflow. It gives complete visibility into access and behavior for human and machine users. Our goal is to bring those capabilities deep into your infrastructure at the protocol level. You can see the five main ways we divide the protocols on the screen here, so Database Access, Application Access, Desktop Access, Kubernetes Access, and Server Access. So Teleport Server Access, for instance, gives you visibility into all your SSH sessions. Kubernetes Access can do things like enforce two-factor authentication for kubectl commands. Database Access can enforce [inaudible] level roles, log all the queries that hit your Postgres or MySQL database, for example. And Application Access can replace your need for VPNs and internal DNS solutions, while still delivering secure internal web applications to your users — things like Jenkins or Grafana, Kibana, etc.
Essential elements of infrastructure access
Colin: 00:01:58.011 And finally, Teleport Desktop Access gives you modern passwordless access for all your Windows servers and desktops, including complete session recordings. So when we think about infrastructure access, what I'm going to demonstrate today are the four essential elements of infrastructure access. The first is connectivity. We're going to show, today, exactly how you can access your resources in a secure way. And if access can come from anywhere, and your compute could be happening anywhere, Teleport can help you securely connect to any resource, regardless of network boundaries. The next element is authentication. So we need identity-based authentication for engineers so we know exactly who or what is accessing each resource. So, for example, if someone's accessing a resource as ec2-user, we still need to be able to link that back to their identity. And so once we've handled authentication, we need authorization, of course, and Teleport provides fine-grained, role-based access controls. So we can define who should get access to what, and that "who" could be either a human or a machine. And finally, of course, if you've connected, authenticated, and you're authorized for a system, we need to audit what you're doing. We need to know who or what accessed each resource, when, and what was done. And any solution that doesn't provide these four elements of access will leave you vulnerable or will severely reduce your productivity. So one of the ways we achieve this is through the use of identity certificates.
Secure access that doesn’t get in the way
Colin: 00:03:57.435 So we wanted to create a solution for secure developer access that doesn't get in the way. It doesn't require you to generate keys — distribute those keys. It doesn't require you to memorize or store long-lived usernames and passwords, etc. We want to use short-lived certificates out of the box, as opposed to other credentials like SSH keys. So assuming you're using a single sign-on provider today, we'll get your identity and role or group from that, and we'll encode it and the privileges that go with that role and identity into short-lived certificates. And then those short-lived certificates are what get you access to all the resources you need. So we bring native support for identity-based access for all resources by enforcing automatic and painless certificate-based authentication and authorization. So in addition to being built on short-lived certificates, we're also built on the idea of role-based access. And the metadata in those certificates means we can provide rules for role-based access. Those rules can be as simple as a developer never has access to prod, or a contractor should only have access to a certain project. With static credentials, creating that sort of granular access takes a lot more work. You have to make sure you distribute the SSH keys, for example, only to the correct systems. And then as people move from one project to another, you have to remove them. By using role-based access linked to your identity, any changes or any updates are made in one place in your identity provider. So the next question is, now we've got your short-lived cert — we've got role-based access. How do we give visibility to our security teams? So that's our audit layer.
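To make the kind of rule Colin describes concrete, here is a minimal sketch of a Teleport role that encodes "a contractor only gets one project's dev systems, and never prod." The role name, login, and label scheme are hypothetical; adjust them to whatever labels your cluster actually uses.

```bash
# Hypothetical "contractor" role: may log in as ubuntu, but only to nodes
# labeled for one project's dev environment; prod nodes are explicitly denied.
cat <<'EOF' > contractor-role.yaml
kind: role
version: v5
metadata:
  name: contractor
spec:
  allow:
    logins: ["ubuntu"]
    node_labels:
      env: "dev"
      project: "smurf-village"   # assumed label scheme
  deny:
    node_labels:
      env: "prod"
EOF

# Create or update the role on the cluster
tctl create -f contractor-role.yaml
```

Because the certificate a user receives encodes their roles, moving someone off a project is just a change to the group mapping in your identity provider; no keys need to be chased down across servers.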
Colin: 00:05:59.000 So we all know security now is much different than it was not long ago. It's not just VPN, and then everything's inside the VPN; it's as if you were sitting at your desk. Employees at different levels need to be able to connect to your infrastructure from anywhere. And that infrastructure could be multi-cloud, hybrid cloud; it really could be anywhere. So that creates risk, as we all know, and so we need to address that risk. To do that and to protect all these modern applications, we need to master those four elements of infrastructure access that we started with. The problem with addressing those is that it often forces trade-offs. So VPNs will focus on connectivity, but they won't provide much when it comes to fine-grained, role-based access control. Shared credentials or a secret vault still leave you to figure out the remote connectivity and auditing, but they'll help you with the credential problem. Although anything that uses shared credentials will still create vulnerabilities. Legacy PAM solutions are definitely stronger than what I've mentioned so far, but they often tend to be bulky and tend to slow things down a bit, and they don't offer a lot for machine-to-machine access. So Teleport is designed from the ground up to be different. We want to consolidate those four essential infrastructure access capabilities: connectivity, authentication, authorization, and audit. We want to unify access policies for engineers and machines using the same certificate-based approach.
Colin: 00:07:54.134 In the end, it won't only be more secure — it'll be easier and it will improve productivity. So how do we deliver on those sorts of promises? So the first is connectivity. So this provides secure infrastructure access for people and service accounts with that single identity-aware access proxy. So our zero trust access solution unifies access to your infrastructure without relying on the VPN via the Teleport proxy. The next problem is authentication. So for authentication, we want to rely on your identity-based — so your identity provider. We want to integrate with your SSO so that you have a single location to onboard and offboard employees; no more cleaning up keys or credentials across lots of systems. We will leverage that SSO integration and generate short-lived certs based on your role or group. And then those short-lived certs can be part of our role-based auth. So Teleport authorization lets you implement fine-grained access control for every employee and service that's accessing your infrastructure, by giving a unique identity to employees, CI/CD servers, service accounts, and bots. That identity is then synchronized across your entire infrastructure so that each node — each server — each database understands whether your certificates are valid and you should be granted access. And lastly, what comes with all of this is a great audit layer. So we strive to provide unprecedented visibility into your infrastructure access.
Colin: 00:09:54.132 This really helps our clients meet and exceed their compliance objectives by recording interactive sessions, file system changes, data transfer, command executions, and other security events. And then they're all stored in the audit log, which can then be exported to your SIEM or your other log aggregation solution. So with that, that brings us to the end of our brief slide presentation. So now it's time to turn our attention over to a product demonstration. So the first thing we're going to demonstrate is using Teleport along with its underlying role-based access control to access a system. So typically, if you're looking at a Linux server, or any server, and you want to SSH in, you may be VPNing first and then SSHing directly onto the server. This doesn't give you a great audit trail. And if you want audit, then you have to come up with some log aggregation solution to aggregate the logs from all your servers, which is typically separate. And then it also introduces the key management problem: key generation, key distribution, key rotation, key retirement, and key audit. Which is not a great problem to deal with. So basically, at Teleport, rather than saying, like a lot of solutions do, which is, "Let us manage keys for you," we say, "Throw away the keys, and let's use certificates instead." So let's take a look at what this looks like, quickly. So in order to do that I'm going to switch tabs here. I'm going to pop over to a Teleport proxy that I have set up.
Demo 1: Teleport RBAC
Colin: 00:11:54.160 So this is the Teleport web UI. It's going to allow me to log in to Teleport as a user. I'm going to have a few different windows open as we go through the demo, and I'll try to keep things really clear on what you're looking at and what context each window's providing as I interact with the system. So as I mentioned, the first thing we need to do is authenticate. We need to say who we are. So for that we want to integrate with your identity provider — anything that speaks OIDC or SAML. So that could be Azure AD — it could be Okta. I'm going to use Auth0 today. So I'm going to say, "Authenticate with Auth0," and it's going to kick off my workflow. And once I authenticate, what I get is a list of the systems to which my role grants me access. So those resources are organized into five broad categories, as I mentioned in the slides. And you'll find those on the left: servers, applications, Kubernetes clusters, databases, and desktops. While I'm not going to belabor each of the five different resource types, I will go through a couple of scenarios which will hopefully let you understand Teleport's value proposition, and it's the same value proposition across each resource. For each resource you then just get, as I mentioned, a list of the resources to which your role grants you access. So in the case of SSH, for example, as an end user it's really nice. I don't have to [inaudible] my SSH config file to remember the names of the systems. I don't have to worry if the server's up or not. If it's in my list, it's up and running, and I can access it, because my role provides me access. As your list gets longer, of course, we have search capabilities to search by name or by label.
Colin: 00:13:55.137 And you can label each of your servers, so you don't need to come up with host names that are really long and complicated to let you know the purpose of the server. So you don't need host names like "production database two west." You can do things like I've done and name your servers after Smurfs, and then let labels cover the rest. So if I wanted to know which server's running Elastic, well, I have a label for that. Most of these are just Linux boxes running in AWS as EC2 nodes. A couple are in my home lab, as you can see here. But let's pick a system and connect to it. So traditionally, to do this, I would have needed to distribute my public key onto the target system somehow, and then I would authenticate with my SSH key — present my private key, and then I would be connected. In order to do that, I would need to be able to reach that end system somehow. So it would either have to be publicly accessible or I'd need to VPN first or SSH into some bastion that I maintain that is publicly accessible. In the case of Teleport, this node can be private. It could have no public IP address. It could have no ports open for ingress. Its only requirement is that it needs to be able to open a reverse tunnel back to the Teleport proxy. So for that it could use the Teleport proxy's private internal address, and it could just reach back to the proxy that way. And then when I'm ready to connect, I just hit connect, select one of the logins that my identity and role provide for me, and Teleport will open that connection over the reverse tunnel. So now, from my home office, I have connected to a remote EC2 node in a secure way, without needing to open any ports for ingress or use a VPN. And I didn't have to distribute SSH keys or anything like that.
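The web UI flow Colin just walked through has a direct CLI equivalent with Teleport's tsh client; the proxy address, connector name, login, and node name below are placeholders for illustration.

```bash
# Authenticate through the Teleport proxy using the cluster's SSO connector
tsh login --proxy=teleport.example.com --auth=auth0

# List the servers your role grants access to, optionally filtering by label
tsh ls
tsh ls env=dev

# Open an SSH session over the reverse tunnel; the target node needs no
# public IP address and no open ingress ports
tsh ssh ubuntu@brainy-smurf
```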
Colin: 00:15:57.887 So running on the node is the Teleport agent; that's what opens that reverse tunnel. It's a small, lightweight Go binary that provides a great deal of functionality and connectivity. So now I've authenticated, my authorization has been completed, and it's showing me all the nodes I have access to. I've done the connectivity. So now we're on to — the only thing we haven't shown for SSH access so far is audit. So I now need to run a few commands to demonstrate the audit capability. So I'll run a trivial command here — something a little silly to kick things off. So Teleport is going to give me a full session recording of the whole session, and it's going to audit the commands I'm running at the kernel level. So of course it's going to capture that talking cow, but what about a command like this? So in this case, this is just a curl of example.com. But to be sneaky, I base-64-encoded it, and then in this command I'm just going to echo that string through to the decoder, and then pipe it into the shell. But when we return to the audit logs, we'll see that Teleport was able to capture the command that was executed at the kernel level, so we'll still get it in our audit logs. So let's end this session and return back to my Teleport proxy view. So now, back at my Teleport proxy view, I authenticated, was authorized, connected, and ran some actions. So let's take a look at those audit logs now. So on the left I'm going to navigate to Activity, and I'm going to see the audit log. Everything I can see here on the left, remember, is role-based. So if you don't have permissions to view the audit logs, you wouldn't see the audit logs.
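For context, a sketch of what the agent side can look like: a node-only teleport.yaml that dials out to the proxy and enables the enhanced, BPF-based session recording behind the kernel-level command capture Colin just described. The addresses, token path, and labels are assumptions; enhanced recording also requires a kernel with BPF support.

```bash
# Hypothetical agent config for a private node: it only dials out to the
# proxy over a reverse tunnel, so nothing needs to be opened for ingress.
cat <<'EOF' | sudo tee /etc/teleport.yaml
version: v3
teleport:
  proxy_server: teleport.example.com:443      # assumed proxy address
  join_params:
    method: token
    token_name: /var/lib/teleport/join-token  # assumed path to a join token
ssh_service:
  enabled: true
  labels:
    env: dev
    app: elastic
  enhanced_recording:
    enabled: true   # BPF-based capture of command, disk, and network events
auth_service:
  enabled: false
proxy_service:
  enabled: false
EOF

sudo teleport start --config=/etc/teleport.yaml
```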
Colin: 00:17:58.405 But given this is a demo, I do. So there's my audit logs, and I can see them here. As I mentioned, you can export these logs to any system of your choice, and so then you can build alerts and dashboards, etc. So I'm going to backscroll a little bit here to where I logged in and my certificate was issued, and then we can see all the events that took place since then. I started a session on a node, then my node's interface ran some commands to build out that interface. We capture everything that happens. Then there's my cowsay command. So let's take a look at what an enhanced log entry of a session command looks like with Teleport. So the first thing we notice is we grab the command name — the program that you're running, we get the path to the program, we get the arg list — the list of arguments passed into the program, and we get the return code. So we get everything about the command that was run. We also get what login it was run under, and perhaps, most importantly, all of that is linked to my identity from my IDP. So I get a full record of exactly who did what, even though I used the Ubuntu login. So if everybody in your organization's using ec2-user or root, you'll still know who did what. Returning back to my list, we can then see there's that base64 and the pipe to the shell. But there's that curl command that I mentioned we would log. So we can see we got the curl command, we got the argument of example.com, and it's all, again, linked to my identity.
Colin: 00:19:52.277 Moreover, we actually captured the network activity that was generated by that curl command, so I can see that the destination address and port was captured, again, linked to my identity. Finally, I cleared the console, I left the session, and we got our session recording uploaded. So if I navigate over here to the left, and I click on session recording, we'll see today's session recordings. So this is the one we just completed. And I can hit play, and I can actually watch back a full record of exactly what took place. And my favorite feature of this is that it's text-based. So if I want to reuse a command or I want to investigate a command, I can simply grab that and copy the text from in there. All right. So if I return, briefly, back to my slide deck, this was the first demo I wanted to show you, which was a straightforward developer accessing a Linux server via the Teleport proxy. So I'm able to take my private nodes, keep them private, but then deliver secure access to them via the Teleport proxy server. So I'm going to switch to a slightly different slide now, just to give you an idea of what that looks like. So what we did, in reality, was we kept our SSH server private. We kept it in that purple box. And then me over here — I was the developer. I used the web UI to access it via the Teleport proxy, which, essentially, acts as our bastion.
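The recordings shown in the web UI can also be listed and replayed from the CLI; the session ID below is only a placeholder.

```bash
# List available session recordings
tsh recordings ls

# Replay one in the terminal (ID is illustrative)
tsh play 8a2f9e64-aaaa-bbbb-cccc-1234567890ab
```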
Demo 2: Teleport access requests
Colin: 00:21:51.123 So that's just a quick summary of what we saw. So now, returning back to that first deck, the next thing I'd like to demonstrate is how just-in-time access requests work. Because, of course, what we saw in the first demo works really well and is fantastic, as long as you stay in your least privileged default role. But we know that often we need to escape that default role. We need elevated privileges — we need a different role for a project that we've been added to — we need access to a system that we don't have access to by default, ever, perhaps because it's a production system — we need to be able, then, to handle those scenarios. So in order to do that, let's pop back over here. In this window I'm authenticated as me. So this window right now is maximized, so I'm just going to get it out of full screen, which I was using for the slides, and I'll move it up here to the top corner. And over here in this other window I'm going to authenticate as a different identity — a different persona, let's say. So up here I'm authenticated as me in my own demo environment, where, as you can imagine, I have all the privileges. But over here I'm going to authenticate now as contractor Caleb. So contractor Caleb has a little bit of a different privilege. He can see only certain servers. So he's only seeing servers with dev labels, let's say. But you know what? In fact, let's have contractor Caleb connect via the command line.
Colin: 00:23:57.024 So contractor Caleb can also — let's clear that out — authenticate via the command line. So Teleport is not only the web UI, but we also have a CLI, which we call tsh. So the first thing I'm going to do is validate if I've authenticated, and then — you know what? Actually, the real first thing I should have done is make this a little bigger. So since I'm not logged in, the first thing I need to do is a tsh login. I think you can probably imagine what's going to happen. It's going to kick off my single sign-on workflow, and I can log in as contractor Caleb. All right. And if I return, we see contractor Caleb gets a view showing him exactly what role he's got, what permissions he has. So the same thing we did from that web UI, contractor Caleb can do here. He can say tsh ls and it can show him the servers he has access to. He could do the same for databases. He could do the same for applications — and we'll see he has none — etc. But for this example, let's do Kubernetes. So through his work today he discovers that he needs access to a Kubernetes cluster. So what he's going to do is he's going to request access to that Kubernetes cluster. And so to do that he's going to ask for a different role. He's going to do a new tsh login, but this time he's going to request the role for stage environment access. So in this way, his role of contractor allows him to request another set of limited — a limited set of roles. So, for example, he can request this role stage environment access —
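Put together, contractor Caleb's CLI steps look roughly like the following. The proxy, connector, and role names are stand-ins for whatever your cluster actually uses.

```bash
# Check whether a valid certificate is already present
tsh status

# Log in with the base role via single sign-on
tsh login --proxy=teleport.example.com --auth=auth0

# See what the base role allows
tsh ls        # servers
tsh db ls     # databases
tsh apps ls   # applications

# Request an additional, elevated role just in time
tsh login --request-roles=stage-env-access \
  --request-reason="Need the staging Kubernetes cluster for ticket ABC-123"
```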
Colin: 00:25:56.442 — except I have to type it correctly — request roles, because you can request multiple, and he's going to wait for the approval. Now, if you recall, up here in this left window I'm still authenticated as me. And what I can do is I can go over to my access request workflow, and I can see that there's a pending request from contractor Caleb for access. So he's requesting stage environment access. He'll get 8 hours of access. So this is short-lived, just-in-time access, and the duration is configurable. So, for example, if you were requesting developer database access, you might only get — you might get 12 hours of access. But if you were requesting access to production systems, you might only get 1 hour of access. And then based on the role you're requesting, we can also have different approval requirements. That is to say, if you were requesting that production access, we might need the SRE team and the audit team both to approve it. But if it was a test database, you might only need one member of the database team. I have a simple approval workflow where one user can approve it. And we can put a little message in there, and I can submit the review. Now contractor Caleb will see his approval. By the way, in order to reduce friction and to speed things up, once he requests approval, he can have it not wait for approval; he can put a no-wait flag and it'll just let him continue working. And his request can send notifications out to Slack, PagerDuty, Jira, Mattermost, email, etc., so that it gets actioned quickly. And if he's an on-call developer — an on-call SRE, for example, and he's the scheduled user in PagerDuty, PagerDuty can be configured to automatically approve his access, based on the on-call schedule.
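Which roles can be requested, and who may approve them, is itself configured in roles. As a hedged sketch with assumed names: the contractor role sketched earlier would carry an allow.request block listing the requestable roles, and approvers hold a role like the one below.

```bash
# Hypothetical reviewer role: its holders may review (approve or deny)
# access requests for the stage-env-access role. On the requester side,
# the contractor role from earlier would additionally carry:
#   allow:
#     request:
#       roles: ["stage-env-access"]
cat <<'EOF' > request-approver-role.yaml
kind: role
version: v5
metadata:
  name: request-approver
spec:
  allow:
    review_requests:
      roles: ["stage-env-access"]
EOF

tctl create -f request-approver-role.yaml
```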
Colin: 00:27:57.165 So let's return to our scenario. He wanted to access a Kubernetes cluster, so he requested that elevated access. So now when he types tsh kube ls we'll see that he has that elevated access. So he can do tsh kube login and select the cluster he wants — smurf-village-2 — and that'll generate him some secure certificates. And hopefully when we do this it'll say that the cluster is now selected. What that means is his native Kubernetes commands now go through Teleport — say he does kubectl get nodes. So this is a small EKS cluster that's only got one node, so he should see one node there. So he gets that direct connectivity via the Teleport proxy secured to the node. And what do we expect to see back here in our audit log? Well, we expect to see that Kubernetes request. First we definitely see his access request - excuse me - his access request being reviewed, and his new certificate being issued, and then we see the Kubernetes request. And we can see that he ran a GET on nodes; and again, all linked to his identity. So what we just saw was a user whose base access didn't provide them the capability to hit a certain system or cluster. And they, then, start a short-lived access request, get their new certificate, and then are able to successfully access that system. So that is our second demonstration.
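Caleb's Kubernetes steps, roughly, with the cluster name taken from the demo:

```bash
# List the Kubernetes clusters the now-elevated role can reach
tsh kube ls

# Fetch short-lived credentials and point kubeconfig at the chosen cluster
tsh kube login smurf-village-2

# From here, plain kubectl traffic goes through the Teleport proxy
kubectl get nodes
```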
Demo 3: Teleport session locking
Colin: 00:29:59.887 So let's move on to the third, which is session locking. So what if, for example, contractor Caleb in the web UI was in a given system, but is, maybe, doing something suspicious? So let's say, in the web UI, he can also adopt that same role. So let's have him adopt that role. And we'll see here he sees that same view into his — that he's in another role, and he can see the list of systems to which that role grants him access, one of which is a system called chef-smurf. So this is an EC2 node with the staging tag, so that's why he now sees it with the stage environment access, and let's have him connect to that. And he's going to connect as ec2-user. So now he's connected as ec2-user. I'm going to, up here in this other window, navigate back to my Teleport proxy, and let's have him run a couple commands. Again, these commands are pretty arbitrary. This time the cow will say, "Hello." Well, if you're working from an audit perspective, what you need to be able to do is say to yourself, "What is happening right now?" You need to be able to go in and see the active sessions, which we can do here. Again, this is all permission-based. If you have the right role, you'll be able to see active sessions. All right.
Colin: 00:31:57.483 Not only that — you have the ability, if you have the right role, to join the session and see what is being done here by contractor Caleb. So if contractor Caleb was about to do some malicious event like sudo rm -rf or something like that, you could quickly exit your view of the session, or you could quickly, maybe, navigate to another window. So over here I'm going to repurpose this terminal we were using. So I'm going to do a quick log out. Log out user equals contractor. And I'm going to log in as myself again, so we can see how you can do this in the CLI. So what we can see here is what he's doing. We join the session, and we see what he's about to do. And maybe, then, we want to go in and say, "Lock his session." We want to kick him out. We can do a lock command on [email protected] and send him a message. And we can put a time to live on the lock. I'll put a 1-hour lock. And I'll fix my typo first. This type of lock can be automated as part of an alert system.
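From the CLI, the observe-and-lock flow looks roughly like this; the session ID and username are placeholders (the real username in the demo is redacted above).

```bash
# Join contractor Caleb's active session to watch what he is doing
# (session ID is illustrative)
tsh join 8a2f9e64-aaaa-bbbb-cccc-1234567890ab

# Lock the user out: the lock terminates his sessions and blocks new
# certificates for as long as it is in force
tctl lock --user=contractor-caleb@example.com \
  --message="Suspicious activity, contact security" \
  --ttl=1h
```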
Colin: 00:33:52.314 So if you are exporting your logs to an alert system in real time, you could be detecting activity that's suspicious and automatically locking sessions. And what happens when we issue that lock is — you'll notice over here his session was disconnected. And if he tries to refresh, you'll see he's getting an access denied error. So contractor Caleb's been removed, but we can see here we were not kicked out of that session. So we can, then, safely end the session for them and return back to the web UI. All right. So that was a quick third demonstration of how we can lock session access, locking out a malicious user. And that brings me to the end of what I wanted to demonstrate today. I wanted to keep the demonstrations fairly brief. What I'd like to emphasize is, what we saw in Server Access is the exact same capability that we can provide for servers, web applications, Kubernetes clusters, your databases, and your Windows desktops. That is — you can keep them private and secure without needing to give them public IP addresses or open ports for ingress, generally, yet have authenticated and authorized users get secure connectivity onto those resources, while at the same time generating a robust audit log.
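As a sketch of the automation Colin describes, assuming your alerting pipeline can hand a flagged Teleport username to a script, the lock itself is just the same tctl call; everything about the invocation below is assumed.

```bash
#!/usr/bin/env bash
# Hypothetical hook invoked by a SIEM/alerting pipeline when it flags a user.
set -euo pipefail

flagged_user="$1"                                       # e.g. from a webhook payload
reason="${2:-Automated lock: suspicious activity detected}"

# Requires credentials permitted to create lock resources
tctl lock --user="$flagged_user" --message="$reason" --ttl=2h
```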
Conclusion
Host: 00:35:43.876 We just want to thank you all for joining us today. Colin, any last words before we say goodbye?
Colin: 00:35:50.842 No, other than to say thank you so much for your time today. We're really happy you were able to join us.
Host: 00:35:56.968 Thank you for joining us, and we'll see you next time.