Dec 3, 2024

Access AWS RDS Databases in EKS Using Teleport Machine ID Without Passwords

by Gavin Frazar

Teleport Kubernetes Database Access without Passwords

At Teleport we love modern infrastructure and open-source software, but don't like static credentials and passwords.

This created a challenge for us when deploying Temporal, an open-source workflow automation platform, on EKS: Temporal always requires a password to authenticate to its backend RDS database.

To solve this problem, we turned to Teleport Machine & Workload Identity. Teleport Machine ID issues and rotates short-lived X.509 certificates for services and provides sidecars that help integrate software that expects passwords.

This post walks you through the steps we took to deploy Temporal in AWS EKS with a sidecar that allows Temporal to connect to an RDS database via Teleport - eliminating the need for passwords, API keys, or other shared secrets.

Along the way, we will leverage automatic user provisioning with PostgreSQL role assignments in RDS, which simplifies access control and gives you a central overview of who has access to what via Teleport.

Prerequisites

  • v16+ Teleport cluster
  • AWS RDS PostgreSQL instance
  • AWS EKS cluster
  • AWS credentials in EKS via IAM roles for service accounts (IRSA) or EKS Pod Identity

How it works

To access the RDS database, we have to deploy a Teleport Database Service in a network that can reach the database and provide the Database Service with AWS IAM credentials.

We will deploy the Database Service as a container on AWS Fargate for ECS, in the same VPC as the RDS database, with IAM credentials provided by the ECS task role.

The Database Service establishes and maintains a reverse tunnel connection to our Teleport cluster, so it does not need a public IP and we don't have to expose any ports. Teleport RBAC restricts user access to the database via the Database Service.

Teleport users will authenticate to the Teleport cluster and receive short-lived X.509 certificates for their identity. They will use those certificates, as a client, for mutual-TLS (mTLS) to connect to a Teleport Proxy Service.

The Proxy Service will forward user traffic over a reverse tunnel to a Database Service and the Database Service will forward the user traffic to the database. Because the Database Service is deployed in the same VPC as the database and establishes an outbound reverse tunnel with the Proxy Service, it can provide access to databases in private networks.

Diagram of EKS deployment showing the interaction between a Temporal container and its tbot sidecar, the other Teleport components, and the RDS database
Deployment Overview

A machine identity can also be granted permission to connect to databases in Teleport, just like any user. In Teleport, a machine identity is represented as a Bot.

Teleport provides a utility called tbot that can provide Teleport credentials to workloads. tbot will be deployed in EKS alongside Temporal as a sidecar container listening for TCP connections on the loopback interface, i.e., it will act as a local proxy.

Temporal will be configured to connect to its persistence and visibility SQL databases at a localhost:5432 endpoint - it does not need to know that the connection is going through Teleport nor does it need to be configured with any long-lived secret like a password.

Teleport will be configured with a join token that delegates authentication to AWS IAM. The EKS deployment will provide AWS IAM credentials to the tbot sidecar that tbot will use to sign a token proving its identity to Teleport, thus allowing tbot to join our Teleport cluster. This setup avoids the need for long-lived secrets such as an access key.

Step 1/6. Enable IAM Authentication for RDS Postgres

The first thing we need to do is configure our RDS instance such that we can access it via Teleport.

Teleport uses IAM authentication to access AWS RDS databases, so we should ensure that IAM auth is enabled on the database:

Screenshot of the AWS console setting that enables RDS IAM authentication for an RDS Postgres instance
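
The same setting can also be switched on from the command line. Here's a minimal sketch, assuming the AWS CLI is installed and configured with credentials that are allowed to modify the instance. The enable_iam_auth wrapper is our own name, not part of any tool, and the identifier in the usage comment is inferred from the endpoint hostname used later in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: enable IAM database authentication on one RDS instance.
# $1 is the DB instance identifier (the first label of the RDS endpoint hostname).
enable_iam_auth() {
  local instance_id="$1"
  aws rds modify-db-instance \
    --db-instance-identifier "$instance_id" \
    --enable-iam-database-authentication \
    --apply-immediately
}
# Usage: enable_iam_auth gavin-tf-rds-postgres-instance
```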

Next, we need to connect to the database to create a database user.

Our example database is in an isolated subnet that can only be accessed from inside the VPC and its security group rules only allow inbound traffic from AWS resources that have a specific security group attached to them.

Rather than temporarily opening up public access to the database, we can use a custom CloudShell environment - psql is installed by default in every CloudShell.

Create a CloudShell environment in the same subnet with a trusted security group attached to it:

Screenshot of an AWS console dialogue box that creates a custom AWS CloudShell environment in the same subnet and security group as the RDS Postgres database

Now connect as the database master user using psql:

[cloudshell-user@ip-192-168-21-105 ~]$ psql "host=gavin-tf-rds-postgres-instance.c3fvtofx6sjs.ca-central-1.rds.amazonaws.com user=master dbname=postgres"
Password for user master:
psql (15.8, server 14.12)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, compression: off)
Type "help" for help.

postgres=>

Teleport supports auto-provisioning users for PostgreSQL databases, so let's set that up to make it easier to provision access for others going forward.

For that, we need a Postgres user with IAM auth enabled and permission to create other Postgres roles with IAM auth enabled.

postgres=> CREATE USER "teleport-admin" WITH CREATEROLE;
CREATE ROLE
postgres=> GRANT rds_iam,rds_superuser TO "teleport-admin" WITH ADMIN OPTION;
GRANT ROLE

IAM authentication for a postgres user is enabled by granting the rds_iam role. We grant rds_iam to "teleport-admin" with the admin option so that it's able to grant rds_iam to other roles after it provisions them just-in-time for Teleport users.

We grant rds_superuser so that we never have to connect via CloudShell as the master user again - it's easier and more secure to just connect via Teleport and we can use Teleport RBAC to control who has superuser access.

At this point we're done using the master user, so we can rotate its password to never be seen or used again.

Finally, we need to tell Teleport that it can use "teleport-admin" to auto-provision other users. We can do that by adding a special tag, teleport.dev/db-admin, to the RDS instance:

"teleport.dev/db-admin" = "teleport-admin"

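If the instance's tags aren't managed through infrastructure-as-code, the teleport.dev/db-admin tag can be added with the AWS CLI instead. This is a sketch under the same assumptions as any AWS CLI call (installed, configured, and permitted to tag the instance). tag_db_admin is a hypothetical wrapper, and the ARN in the usage comment is a placeholder assembled from this post's example account ID and region.

```shell
#!/usr/bin/env bash
# Hypothetical helper: tag an RDS instance so Teleport knows which database
# user it may use as the auto-provisioning admin.
# $1 is the full ARN of the RDS instance, $2 is the admin database user.
tag_db_admin() {
  local db_arn="$1" admin_user="$2"
  aws rds add-tags-to-resource \
    --resource-name "$db_arn" \
    --tags "Key=teleport.dev/db-admin,Value=${admin_user}"
}
# Usage (placeholder ARN):
#   tag_db_admin "arn:aws:rds:ca-central-1:123456789012:db:gavin-tf-rds-postgres-instance" teleport-admin
```
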
Step 2/6. Enroll the RDS Postgres instance with Teleport

We'll use Teleport's AWS OIDC integration to enroll the database into our Teleport cluster and deploy the Teleport Database Service.

The integration provides an RDS enrollment wizard that deploys a Teleport Database Service instance using AWS Fargate for Amazon Elastic Container Service (ECS).

We can find the “RDS PostgreSQL” enrollment wizard under "Access Management" in our Teleport web portal:

Screenshot of the Teleport Web UI Enroll New Resource page

When you go through the enrollment flow, just use the "teleport-admin" database user directly for now. We'll finish the auto-provisioning configuration next.

Step 3/6. Enable database user auto-provisioning in Teleport

Our RDS database is enrolled and we can connect to it, but we need a Teleport role that allows automatic user provisioning.

Create temporal-db-auto-user.yaml:

$ cat <<EOF > temporal-db-auto-user.yaml
kind: role
version: v7
metadata:
  name: temporal-db-auto-user
spec:
  allow:
    db_labels:
      engine: "postgres"
      teleport.dev/cloud: "AWS"
    # db_names is a list of postgres databases that can be connected to.
    db_names:
      - "{{internal.db_names}}"
    # db_roles is a list of roles the database user will be assigned
    db_roles:
      - "{{internal.db_roles}}"
  options:
    create_db_user_mode: "keep"
EOF

$ tctl create temporal-db-auto-user.yaml

This is a templated role - the {{internal.<trait>}} values are substituted from user traits.

Let's update our Teleport user to assign the temporal-db-auto-user role and set our user traits. We'll put "rds_superuser" in db_roles, and "postgres" (the default), "temporal", and "temporal_visibility" in db_names so that we can connect to each of those databases in the Postgres instance.

$ tsh login [email protected] --proxy=gavin-leaf.cloud.gravitational.io:443
$ tctl users update [email protected] \
    --set-roles=access,editor,temporal-db-auto-user \
    --set-db-roles=rds_superuser \
    --set-db-names="postgres,temporal,temporal_visibility"
# logout and login again to refresh our user traits
$ tsh logout
$ tsh login [email protected] --proxy=gavin-leaf.cloud.gravitational.io:443
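
To make the trait substitution concrete, here is a small illustration of what the templated db_names and db_roles lists turn into for the user we just updated. It's a toy model of the behaviour described above, not Teleport's implementation. expand_trait is a made-up helper, and it assumes the trait values arrive as a comma-separated string.

```shell
#!/usr/bin/env bash
# Toy model: print one YAML list item per trait value.
# $1 is a comma-separated list of trait values, e.g. "temporal,temporal_visibility".
expand_trait() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r value; do
    printf -- '- "%s"\n' "$value"
  done
}

# What the role's templated lists effectively become for our user.
echo "db_names:"
expand_trait "postgres,temporal,temporal_visibility"
echo "db_roles:"
expand_trait "rds_superuser"
```

A single template entry in the role can therefore expand into several concrete values, which is why one role can serve both our human user and, later, the bot.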

In the next step we'll run Temporal's initialization scripts to create the "temporal" and "temporal_visibility" databases in the Postgres instance. To do that, our database user needs the CREATEDB attribute. Teleport database user auto-provisioning doesn't support provisioning users with specific attributes like CREATEDB, but we can grant it to ourselves since we have the rds_superuser role:

$ tsh db connect gavin-tf-rds-postgres-instance --db-name=postgres
psql (16.3 (Homebrew), server 14.12)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_128_GCM_SHA256, compression: off)
Type "help" for help.

postgres=> ALTER ROLE "[email protected]" WITH CREATEDB;
ALTER ROLE

Since we used create_db_user_mode: "keep" in the temporal-db-auto-user role above, the user will not be dropped after we disconnect, thus preserving the CREATEDB attribute.

Step 4/6. Initializing Temporal schemas

Temporal provides an SQL database initialization tool to create and configure persistence and visibility databases.

We can use Teleport to provide the Temporal init tool access to our database.

First, start a tsh local proxy tunnel:

$ tsh proxy db gavin-tf-rds-postgres-instance \
  --db-name=postgres \
  --tunnel \
  --port 5432

Next, clone the Temporal repo and make the Temporal init tool:

$ git clone git@github.com:temporalio/temporal.git
$ cd temporal
$ make temporal-sql-tool

Now just point temporal-sql-tool at the tsh local proxy tunnel and run it:

$ export SQL_PLUGIN=postgres12_pgx
$ export SQL_HOST=localhost
$ export SQL_PORT=5432
$ export SQL_USER=[email protected]

$ ./temporal-sql-tool --database temporal create-database
$ SQL_DATABASE=temporal ./temporal-sql-tool setup-schema -v 0.0
$ SQL_DATABASE=temporal ./temporal-sql-tool update -schema-dir schema/postgresql/v12/temporal/versioned

$ ./temporal-sql-tool --database temporal_visibility create-database
$ SQL_DATABASE=temporal_visibility ./temporal-sql-tool setup-schema -v 0.0
$ SQL_DATABASE=temporal_visibility ./temporal-sql-tool update -schema-dir schema/postgresql/v12/visibility/versioned

Step 5/6. Configuring Teleport Machine ID

We're going to deploy tbot as a sidecar container in EKS with access to AWS IAM role credentials. This will allow it to join our Teleport cluster using an IAM join token.

We can provide IAM credentials to the tbot container using EKS Pod Identity or IAM roles for service accounts (IRSA).

We'll use IRSA in this example because it works on any Kubernetes cluster, not just EKS. There's a Terraform module that makes it easy:

locals {
  kube_namespace    = "temporal"
  tbot_sa_name      = "tbot"
  oidc_provider_arn = "arn:aws:iam::123456789012:oidc-provider/oidc.eks.ca-central-1.amazonaws.com/id/9E8BC93C5CD913B88DC78C95960A851D"
}

module "tbot_temporal_irsa" {
  source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"

  oidc_providers = {
    main = {
      namespace_service_accounts = ["${local.kube_namespace}:${local.tbot_sa_name}"]
      provider_arn               = local.oidc_provider_arn
    }
  }
  role_name = "${local.kube_namespace}-${local.tbot_sa_name}"
}

The IAM role trust policy it creates looks like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.ca-central-1.amazonaws.com/id/9E8BC93C5CD913B88DC78C95960A851D"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "oidc.eks.ca-central-1.amazonaws.com/id/9E8BC93C5CD913B88DC78C95960A851D:sub": "system:serviceaccount:temporal:tbot",
                    "oidc.eks.ca-central-1.amazonaws.com/id/9E8BC93C5CD913B88DC78C95960A851D:aud": "sts.amazonaws.com"
                }
            }
        }
    ]
}
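
Two strings derived from the Terraform locals come up again later: the IAM role name, which appears in the join token and in the Helm values, and the service account subject in the trust policy's condition. A quick sketch of how they are composed (the function names are ours, chosen for illustration):

```shell
#!/usr/bin/env bash
kube_namespace="temporal"
tbot_sa_name="tbot"

# IAM role name, composed the same way as role_name in the Terraform module call.
irsa_role_name() { printf '%s-%s\n' "$1" "$2"; }

# The "sub" value that the trust policy condition compares against.
irsa_subject() { printf 'system:serviceaccount:%s:%s\n' "$1" "$2"; }

irsa_role_name "$kube_namespace" "$tbot_sa_name"   # temporal-tbot
irsa_subject "$kube_namespace" "$tbot_sa_name"     # system:serviceaccount:temporal:tbot
```

If the namespace or service account name used by the Helm release differs from these values, the subject will not match the condition and the role cannot be assumed.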

Next, configure Teleport Machine ID so that it can access the RDS database. For that we will need three resources in Teleport:

  1. a bot identity
  2. a role for the bot
  3. a token that allows the bot to join the Teleport cluster

We already created an auto-provisioning role earlier, so we can use that for the bot role.

Create the bot and give it the temporal-db-auto-user role we created earlier:

$ cat <<EOF > temporal-bot.yaml
kind: bot
version: v1
metadata:
  # name is a unique identifier for the bot in the cluster.
  name: temporal
spec:
  # roles is a list of Teleport roles to grant to the bot.
  roles:
    - temporal-db-auto-user
  # traits controls the traits applied to the Bot user. These are fed into the
  # role templating system and can be used to grant a specific Bot access to
  # specific resources without the creation of a new role.
  traits:
    - name: db_names
      values:
        - temporal
        - temporal_visibility
    - name: db_roles
      values:
        - pg_read_all_data
        - pg_write_all_data
EOF

$ tctl create temporal-bot.yaml

tbot will authenticate itself to Teleport by using its assumed IAM role. We'll restrict the join token to only allow the temporal-tbot IAM role we created earlier:

$ cat <<EOF > temporal-bot-token.yaml
kind: token
version: v2
metadata:
  name: temporal-bot
spec:
  roles: [Bot]
  bot_name: temporal
  join_method: iam
  # Restrict the AWS account and (optionally) ARN that can use this token.
  # This information can be obtained from running the
  # "aws sts get-caller-identity" command from the CLI.
  allow:
    - aws_account: "123456789012"
      aws_arn: "arn:aws:sts::123456789012:assumed-role/temporal-tbot/*"
EOF

$ tctl create temporal-bot-token.yaml
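
The trailing /* in aws_arn matters because the last segment of an assumed-role ARN is the role session name, which is chosen per session. As a rough illustration of which callers the rule admits, here is the same wildcard expressed as a shell pattern. This is an approximation for building intuition, not Teleport's matching code, and arn_allowed is a hypothetical helper.

```shell
#!/usr/bin/env bash
allowed_pattern='arn:aws:sts::123456789012:assumed-role/temporal-tbot/*'

# Return 0 if the caller ARN matches the allow pattern, 1 otherwise.
arn_allowed() {
  local arn="$1"
  # The pattern variable is left unquoted on purpose so that its * acts as a wildcard.
  case "$arn" in
    $allowed_pattern) return 0 ;;
    *) return 1 ;;
  esac
}

# Try a matching role, a different role, and a different account.
for arn in \
  'arn:aws:sts::123456789012:assumed-role/temporal-tbot/session-1' \
  'arn:aws:sts::123456789012:assumed-role/other-role/session-1' \
  'arn:aws:sts::999999999999:assumed-role/temporal-tbot/session-1'; do
  if arn_allowed "$arn"; then echo "allow $arn"; else echo "deny  $arn"; fi
done
```

Only the first ARN is allowed: the second uses a different role and the third comes from a different account.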

Now in our Kubernetes cluster let's create the "temporal" namespace:

$ kubectl create namespace temporal

Finally, create a ConfigMap for the tbot configuration file:

$ cat <<EOF > tbot-configmap.yaml
apiVersion: "v1"
kind: "ConfigMap"
metadata:
  name: tbot
  namespace: temporal
data:
  tbot.yaml: |
    version: v2
    debug: true
    onboarding:
      join_method: iam
      token: temporal-bot
    storage:
      type: memory
    proxy_server: gavin-leaf.cloud.gravitational.io:443
    services:
      - type: "database-tunnel"
        listen: "tcp://localhost:5432"
        service: gavin-tf-rds-postgres-instance
        database: temporal
        username: bot-temporal
EOF

$ kubectl apply -f ./tbot-configmap.yaml

This config file tells tbot to listen for TCP connections on localhost:5432 and proxy traffic to our RDS database through an authenticated tunnel.

Applications running in the same pod as the tbot sidecar can connect to the local proxy and access the database.
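
A plain TCP probe is one way to confirm that the local proxy is accepting connections. The sketch below assumes bash is available in the container it runs from. Distroless images typically ship without a shell, so this would run in a neighbouring container in the same pod rather than in the tbot container itself. port_open and wait_for_port are hypothetical helpers.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds.
# Relies on bash's /dev/tcp redirection, so it needs bash rather than plain sh.
port_open() {
  local host="$1" port="$2"
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Poll once per second until the port is open or the attempts run out.
wait_for_port() {
  local host="$1" port="$2" attempts="${3:-30}"
  local i=0
  while [ "$i" -lt "$attempts" ]; do
    port_open "$host" "$port" && return 0
    sleep 1
    i=$((i + 1))
  done
  return 1
}
# Usage: wait_for_port localhost 5432 && echo "tbot tunnel is listening"
```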

Step 6/6. Configure and deploy the Temporal Helm chart

We finally arrive at the last step: deploying the Temporal helm chart.

We need to provide the helm chart with values. The tbot sidecar will be listening on localhost:5432 and it will ignore any password sent by local clients, but the Temporal helm chart nonetheless requires that we provide it a password, so just provide it with a dummy value.

Create values.yaml:

serviceAccount:
  create: true
  name: "tbot"
  extraAnnotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/temporal-tbot"

server:
  replicaCount: 1
  config:
    logLevel: "debug,info"
    persistence:
      default:
        driver: "sql"
        sql:
          driver: "postgres12"
          host: "localhost"
          port: 5432
          database: "temporal"
          user: "bot-temporal"
          password: "dummypass"
          maxConns: 20
          maxConnLifetime: "1h"
          tls:
            enabled: false
      visibility:
        driver: "sql"
        sql:
          driver: "postgres12"
          host: "localhost"
          port: 5432
          database: "temporal_visibility"
          user: "bot-temporal"
          password: "dummypass"
          maxConns: 20
          maxConnLifetime: "1h"
          tls:
            enabled: false
  additionalVolumes:
    - name: "tbot-config"
      configMap:
        name: "tbot"
  sidecarContainers:
    - name: "tbot"
      image: "public.ecr.aws/gravitational/tbot-distroless:17.0.1"
      args:
        - "start"
        - "-c"
        - "/config/tbot.yaml"
      securityContext:
        runAsNonRoot: true
      volumeMounts:
        - name: "tbot-config"
          mountPath: "/config"

postgresql:
  enabled: true
prometheus:
  enabled: true
grafana:
  enabled: true
cassandra:
  enabled: false
mysql:
  enabled: false
elasticsearch:
  enabled: false
schema:
  createDatabase:
    enabled: false
  setup:
    enabled: false
  update:
    enabled: false

Now deploy the Temporal helm chart:

$ helm install temporal temporal/temporal -n temporal -f values.yaml

The deployments should all come up pretty quickly:

$ kubectl get deploy -n temporal
NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
temporal-admintools               1/1     1            1           3m
temporal-frontend                 1/1     1            1           3m
temporal-grafana                  1/1     1            1           3m
temporal-history                  1/1     1            1           3m
temporal-kube-state-metrics       1/1     1            1           3m
temporal-matching                 1/1     1            1           3m
temporal-prometheus-pushgateway   1/1     1            1           3m
temporal-prometheus-server        1/1     1            1           3m
temporal-web                      1/1     1            1           3m
temporal-worker                   1/1     1            1           3m

Try forwarding a local port to the Temporal web UI:

$ kubectl port-forward -n temporal services/temporal-web 8080:8080

Now navigate to http://127.0.0.1:8080/ in your browser.
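
If the UI doesn't load or a deployment never becomes ready, the tbot sidecar's logs are a good first place to look, since the tbot config above sets debug: true. A sketch, assuming the sidecar container is named tbot as in values.yaml and that the deployments are named as in the listing above; tbot_logs is a hypothetical wrapper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: show recent logs from the tbot sidecar of one Temporal deployment.
# $1 is the deployment name, e.g. temporal-frontend.
tbot_logs() {
  local deployment="$1"
  kubectl logs "deployment/${deployment}" \
    --namespace temporal \
    --container tbot \
    --tail=50
}
# Usage: tbot_logs temporal-frontend
```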

Review

We have database user auto-provisioning set up, so now we can use Teleport RBAC to easily provision new users and control who or what has access to our RDS database - goodbye shared password vault!

Users and machines can now connect to the database via Teleport, so we don't need to expose the database to the internet and users don't need to worry about where they connect from.

If an auditor needs to review who, or what, has been accessing the database and what they are doing, then they can review a comprehensive audit log and play back everything that happened.
