Federating Amazon Redshift Access Using Teleport
Overview
This video is a deep dive into providing secure access to Amazon Redshift. It introduces Amazon Redshift, explains the setup, and provides a helpful demo of how it works.
Ben: Hi, I'm Ben with Teleport, and today I'll be walking you through how you can use Teleport to protect Redshift clusters.
Redshift
Redshift is an AWS data lake, but, effectively, it's just a database. Here are a few things you can do with Teleport's Database Access. One of the main benefits is that users use short-lived database credentials instead of usernames and passwords. This all goes through a single sign-on flow, which makes maintaining org-wide identity much easier. So as people come and go, you know who has access and who doesn't. It also gives you full capabilities for querying and auditing. So what actions do people perform on the database, and which tables do they query? And then lastly, you can implement more advanced logic on your database users. So you can really restrict access, and then if people need to access certain tables, they'll go through an access request workflow.
Setup
Ben: This video is going to mainly focus on what it looks like and the initial setup. We have full documentation here. Redshift is a data lake product from Amazon, but you query it using Postgres. So there are a few prerequisites. I already have a Redshift database. I guess they call it a data warehouse. I think the data lake is slightly different in my world of data, but here we go. So I have one Redshift cluster, and I've actually created this using Terraform. This is a small instance. If you're doing this too, dc1.large is one of the cheapest. But just be warned, it's still 25 cents an hour, which works out to about $180 a month. So if you're testing this out and you don't have a huge budget, fire it up and then shut it back down again.
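The running cost mentioned above can be checked with a few lines of Python. The 25 cents an hour figure comes from the video; the 24-hour day and 30-day month are assumptions for a rough estimate, and the function name is only illustrative.

```python
# Rough monthly cost of leaving a small Redshift node running.
HOURLY_RATE_USD = 0.25   # figure quoted in the video
HOURS_PER_DAY = 24       # assumption: the cluster is never paused
DAYS_PER_MONTH = 30      # assumption: a 30-day month


def monthly_cost(hourly_rate: float,
                 hours: int = HOURS_PER_DAY,
                 days: int = DAYS_PER_MONTH) -> float:
    """Return the cost of running one node continuously for a month."""
    return hourly_rate * hours * days


print(f"${monthly_cost(HOURLY_RATE_USD):.2f} per month")  # → $180.00 per month
```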
Ben: I have also created a custom VPC security group, and I have a subnet set. Here, you can see I have created a master username. I've called it teleport, and I just created a random password for it. But we're not using usernames and passwords; we're going to be using short-lived credentials. So we have information here about the admin username. So this is an AWS user, same one with Teleport. Next up, you need to create an IAM policy. Teleport uses IAM policies to authenticate with the Redshift database. IAM authentication is enabled by default for all Redshift databases, but for Teleport to create these temporary IAM tokens, you need to create an IAM role and then apply it to the database instance. So, in my case here, it's using a wildcard. This will be your account ID, and then the dbuser will have these as wildcards. Lastly, your [inaudible] will have the ability to allow DescribeClusters. So we have a bit more information here about how you'd want to configure it.
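A minimal sketch of the kind of IAM policy described here, built as a Python dictionary and printed as JSON. It assumes the policy allows redshift:GetClusterCredentials on wildcard dbuser, dbname and dbgroup resources plus redshift:DescribeClusters; the dbname and dbgroup entries, the function name and the example account ID are assumptions, not details shown in the video.

```python
import json


def redshift_iam_policy(account_id: str) -> dict:
    """Build an IAM policy document for issuing temporary Redshift credentials."""
    # Region, cluster and user/database/group names are all wildcards,
    # so only the account ID is specific (as in the demo).
    resources = [
        f"arn:aws:redshift:*:{account_id}:{kind}:*/*"
        for kind in ("dbuser", "dbname", "dbgroup")  # dbname/dbgroup are assumed
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "redshift:GetClusterCredentials",
                "Resource": resources,
            },
            {
                "Effect": "Allow",
                "Action": "redshift:DescribeClusters",
                "Resource": "*",
            },
        ],
    }


# "123456789012" is a placeholder account ID, not a real one.
print(json.dumps(redshift_iam_policy("123456789012"), indent=2))
```

To narrow what Teleport can reach, the wildcards in the resource ARNs could be replaced with a specific region, cluster name or database user.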
Ben: Depending upon your organizational structure, you may want to really narrow down what Teleport can get access to. And lastly, there is a note here that Teleport does not auto-create users, so the user must already exist. Then there's information about setting up the Teleport Auth and Proxy service. For this, you can follow our Quick Start or something similar. I'm going to skip through this, but there are a few important things here. One thing we do here is update the trait to provide the db_user. So, in my case, I use teleport, but in this demo we use awscluster, and then we add the db_name. So if I come in here, you can see I have a redshift user, which is a db user, and then you can actually access all db labels.
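One way to picture the db_user, db_name and db labels settings is as the allow section of a Teleport role. The sketch below builds such a role as a Python dictionary. It is an assumption-laden illustration rather than the configuration shown in the video: the role name, the role version and the choice to express this as a role (instead of user traits) are all hypothetical.

```python
import json


def database_role(name: str, db_users: list, db_names: list) -> dict:
    """Sketch of a role that grants access to databases by label, user and name."""
    return {
        "kind": "role",
        "version": "v5",  # assumption: depends on the Teleport release in use
        "metadata": {"name": name},
        "spec": {
            "allow": {
                "db_labels": {"*": "*"},      # "access all db labels"
                "db_users": list(db_users),   # database accounts that may be used
                "db_names": list(db_names),   # databases within the cluster
            }
        },
    }


# "redshift-access" is a hypothetical role name.
role = database_role("redshift-access", ["teleport"], ["demo_redshift_db"])
print(json.dumps(role, indent=2))
```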
Ben: And then I have a Teleport database service. This is an instance which I have in the same VPC as the Redshift cluster, and it will communicate directly with the cluster and be able to create these temporary tokens. When you create this, you have to add the uri of the cluster and then the port, which is a unique port for Redshift, 5439, then the region and the cluster ID. So I can show you what mine looks like. I also add it to Teleport for debugging purposes. So we have the Redshift. The protocol is Postgres. Here's my cluster, and it's a terraformed Redshift cluster, similar to the one I showed you.
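The database entry described here (uri with port 5439, region, cluster ID, Postgres protocol) can be sketched as a small Python helper. The field names follow the Teleport database service configuration as far as I know it, and the endpoint host, region and cluster ID below are placeholders, not the values from the demo.

```python
import json

REDSHIFT_PORT = 5439  # Redshift's default port, as mentioned in the video


def redshift_database_entry(name: str, endpoint_host: str,
                            region: str, cluster_id: str,
                            port: int = REDSHIFT_PORT) -> dict:
    """Assemble one database entry for a Teleport database service config."""
    return {
        "name": name,
        "protocol": "postgres",              # Redshift is queried over Postgres
        "uri": f"{endpoint_host}:{port}",    # cluster endpoint plus port
        "aws": {
            "region": region,
            "redshift": {"cluster_id": cluster_id},
        },
    }


# All values below are hypothetical placeholders.
entry = redshift_database_entry(
    name="redshift",
    endpoint_host="example-cluster.abc123.us-west-2.redshift.amazonaws.com",
    region="us-west-2",
    cluster_id="example-cluster",
)
print(json.dumps(entry, indent=2))
```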
Ben: Terraform also makes it very easy to create the IAM roles that you need. I can give you a little snapshot of what that looks like for me. So we have an IAM role policy with get cluster credentials, and the only thing that I've changed here, because everything else is a wildcard, is the caller_identity.current.account_id. And then this gets applied to this instance, so it can create these temporary tokens for you. And then last up, where we say "AWS credentials", what this means is you need to apply the AWS role policy to that instance. So I can also show you what that looks like. So here I have the instance, which is sort of my bastion. I have rds_role, which covers MySQL, Postgres, and Redshift. If I come in here, there's an inline policy, and you can see the same thing. It's been applied. And this means that, on behalf of this instance, it can go and create these credentials for you.
Demo
Ben: So let's connect. So I'm going to just log out, just to start clean. Terminal-side, I've logged in using my GitHub. I've already authenticated with GitHub, so it's already logged me in. And here you can see I have three databases. I need to log into Redshift to get the short-lived credentials. And here we have the information, so we connect using psql. So, in my case, my dbname for Redshift was demo_redshift_db cluster. And then I'm using the teleport user, but I'm not logging in with the password. I am missing a quote, and then here we are. So I'm now logged in to the cluster using the teleport user. What you'd want to do is create dedicated users, which would have more fine-grained controls. So you might have a decent auditor role which can just, like, view tables, and then you can link that to identities in Teleport using roles or go through access workflows.
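The psql connection step can be sketched as a helper that assembles the command. It uses libpq's standard service, user and dbname connection keywords and assumes that logging in to the database through Teleport has set up a matching service entry; the service name "redshift" is a guess, not something confirmed by the video. Building the argument list in code also sidesteps the missing-quote problem from the demo.

```python
import shlex


def psql_command(service: str, db_user: str, db_name: str) -> list:
    """Return the argv for psql using a libpq connection string."""
    # The whole connection string is passed as a single argument,
    # so no manual quoting is needed when it is run via subprocess.
    conninfo = f"service={service} user={db_user} dbname={db_name}"
    return ["psql", conninfo]


# "redshift" as the service name is an assumption; the user and database
# name are the ones mentioned in the demo.
cmd = psql_command("redshift", "teleport", "demo_redshift_db")

# shlex.join shows the command as it would be typed in a shell, quotes included.
print(shlex.join(cmd))
```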
Ben: So if I come back to Teleport, all of this information for the Redshift cluster is recorded, and you can see if any query has been executed. So then you can find out what the query was and all the other information about that session. So this brings me to the end of using Teleport with Redshift. If you have any questions, feel free to contact us on our GitHub discussions, Slack community or forum. We're more than happy to help. Thanks for watching.