Storage Backends
Date: 2025-07-30
A Teleport cluster stores different types of data in different locations. By default everything is stored in a local directory on the Auth Service host.
For self-hosted Teleport deployments, you can configure Teleport to integrate with other storage types based on the nature of the stored data (size, read/write ratio, mutability, etc.).
|Data type|Description|Supported storage backends|
|---|---|---|
|core cluster state|Cluster configuration (e.g. users, roles, auth connectors) and identity (e.g. certificate authorities, registered nodes, trusted clusters).|Local directory (SQLite), etcd, PostgreSQL, Amazon DynamoDB, GCP Firestore, CockroachDB|
|audit events|Events from the audit log (e.g. user logins, RBAC changes)|Local directory, PostgreSQL, CockroachDB, Amazon DynamoDB, Amazon S3, GCP Firestore|
|session recordings|Raw terminal recordings of interactive user sessions|Local directory, Amazon S3 (and any S3-compatible product), GCP Cloud Storage, Azure Blob Storage|
|teleport instance state|ID and credentials of a non-auth teleport instance (e.g. node, proxy)|Local directory, Kubernetes Secret|
Cluster state
Cluster state is stored in a central storage location configured by the Auth Service. The cluster state includes:
- Agent and Proxy Service membership information, including offline/online status.
- List of active sessions.
- List of locally stored users.
- RBAC configuration (roles and permissions).
- Dynamic configuration.
There are two ways to achieve High Availability. You can "outsource" this function to the infrastructure, for example by using highly available network-based disk volumes (similar to AWS EBS) and by migrating a failed VM to a new host. In this scenario, there's nothing Teleport-specific to be done.
If High Availability cannot be provided by the infrastructure (perhaps you're running Teleport on a bare metal cluster), you can still configure Teleport to run in a highly available fashion.
Teleport Enterprise Cloud takes care of this setup for you so you can provide secure access to your infrastructure right away.
Auth Service State
To run multiple instances of the Teleport Auth Service, you must switch to one of the high-availability storage backends listed below first.
Once you have a high-availability storage backend and multiple instances of the Auth Service running, you'll need to create a load balancer to evenly distribute traffic to all Auth Service instances and have a single point of entry for all components that need to communicate with the Auth Service. Use the address of the load balancer in the auth_server field when configuring other components of Teleport.
Configure your load balancer to use Layer 4 (TCP) load balancing, round-robin load balancing, and a 300 second idle timeout.
With multiple instances of the Auth Service running, special attention needs to be paid to keeping their configuration identical. Settings like cluster_name, tokens, storage, etc. must be the same.
Proxy Service State
The Teleport Proxy Service is stateless, which makes running multiple instances trivial.
If using the default configuration, configure your load balancer to forward port 3080 to the servers that run the Teleport Proxy Service. If you have configured your Proxy Service to not use TLS Routing and/or are using non-default ports, you will need to configure your load balancer to forward the ports you specified for listen_addr, tunnel_listen_addr, and web_listen_addr in teleport.yaml.
Configure your load balancer to use Layer 4 (TCP) load balancing, round-robin load balancing, and a 300 second idle timeout.
If you terminate TLS with your own certificate for web_listen_addr at your load balancer, you'll need to run Teleport with --insecure-no-tls.
If your load balancer supports HTTP health checks, configure it to hit the /readyz diagnostics endpoint on machines running Teleport. This endpoint must be enabled by using the --diag-addr flag to teleport start:
teleport start --diag-addr=0.0.0.0:3000
The /readyz endpoint will reply {"status":"ok"} if the Teleport service is running without problems.
The endpoint must be exposed on an interface of the proxy host that the load balancer can reach for the health checks to succeed. You should only do this on the Proxy Service instances, and ensure that port 3000 is reachable only from the load balancers, not from the public internet. For other services, continue to use the 127.0.0.1 local loopback interface.
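As a rough illustration of what a health check against /readyz looks for, here is a minimal Python sketch. The function names `is_ready` and `probe` are hypothetical, and treating anything other than HTTP 200 as unhealthy is an assumption of this sketch rather than documented behavior; only the {"status":"ok"} body comes from the text above.

```python
import json
import urllib.request


def is_ready(status_code: int, body: str) -> bool:
    """Return True when a /readyz reply reports a healthy instance.

    Assumption: a healthy reply uses HTTP 200; the JSON body
    {"status": "ok"} is the part described in this guide.
    """
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def probe(base_url: str, timeout: float = 2.0) -> bool:
    """Fetch <base_url>/readyz; any network or HTTP error counts as not ready."""
    try:
        with urllib.request.urlopen(f"{base_url}/readyz", timeout=timeout) as resp:
            return is_ready(resp.status, resp.read().decode("utf-8"))
    except OSError:
        return False
```

For example, `probe("http://10.0.0.5:3000")` would check a proxy started with `--diag-addr=0.0.0.0:3000`; the address is a placeholder.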
We'll cover how to use etcd, PostgreSQL, DynamoDB, and Firestore storage backends to make Teleport highly available below.
Etcd
Teleport can use etcd as a storage backend to
achieve highly available deployments. You must take steps to protect access to
etcd in this configuration because that is where Teleport secrets like keys
and user records will be stored.
etcd can currently only be used to store Teleport's internal database in a highly available way. This will allow you to have multiple Auth Service instances in your cluster for a High Availability deployment, but it will not also store Teleport audit events for you in the same way that DynamoDB or Firestore will.
etcd is not designed to handle large volumes of time series data like audit events.
To configure Teleport for using etcd as a storage backend:
- Make sure you are using etcd versions 3.3 or newer.
- Follow etcd's cluster hardware recommendations. In particular, leverage SSD or high-performance virtualized block device storage for best performance.
- Install etcd and configure peer and client TLS authentication using the etcd security guide.
  - You can use this script provided by etcd if you don't already have a TLS setup.
- Configure all Teleport Auth Service instances to use etcd in the "storage" section of the config file as shown below.
- Deploy several Auth Service instances connected to the etcd backend.
- Deploy several Proxy Service instances that have auth_server pointed to the Auth Service to connect to.
teleport:
  storage:
    type: etcd
    # List of etcd peers to connect to:
    peers: ["https://172.17.0.1:4001", "https://172.17.0.2:4001"]
    # Required path to TLS client certificate and key files to connect to etcd.
    #
    # To create these, follow
    # https://coreos.com/os/docs/latest/generate-self-signed-certificates.html
    # or use the etcd-provided script
    # https://github.com/etcd-io/etcd/tree/master/hack/tls-setup.
    tls_cert_file: /var/lib/teleport/etcd-cert.pem
    tls_key_file: /var/lib/teleport/etcd-key.pem
    # Optional file with trusted CA authority
    # file to authenticate etcd nodes
    #
    # If you used the script above to generate the client TLS certificate,
    # this CA certificate should be one of the other generated files
    tls_ca_file: /var/lib/teleport/etcd-ca.pem
    # Alternative password-based authentication, if not using TLS client
    # certificate.
    #
    # See https://etcd.io/docs/v3.4.0/op-guide/authentication/ for setting
    # up a new user.
    username: username
    password_file: /mnt/secrets/etcd-pass
    # etcd key (location) where teleport will be storing its state under.
    # make sure it ends with a '/'!
    prefix: /teleport/
    # NOT RECOMMENDED: enables insecure etcd mode in which self-signed
    # certificate will be accepted
    insecure: false
    # Optionally sets the limit on the client message size.
    # This is usually used to increase the default which is 2MiB
    # (1.5MiB server's default + gRPC overhead bytes).
    # Make sure this does not exceed the value for the etcd
    # server specified with `--max-request-bytes` (1.5MiB by default).
    # Keep the two values in sync.
    #
    # See https://etcd.io/docs/v3.4.0/dev-guide/limit/ for details
    #
    # This bumps the size to 15MiB as an example:
    etcd_max_client_msg_size_bytes: 15728640
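A few of the constraints in the etcd configuration above are easy to get wrong by hand, such as the trailing slash on the prefix and the byte value for 15MiB. The following Python sketch checks them on a dictionary that mirrors the storage section. The helper `check_etcd_storage` is hypothetical and is not part of Teleport; flagging non-https peers is a choice of this sketch, based on the TLS recommendation above.

```python
MIB = 1024 * 1024


def check_etcd_storage(storage: dict) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = []
    if storage.get("type") != "etcd":
        problems.append("type must be 'etcd'")
    peers = storage.get("peers") or []
    if not peers:
        problems.append("at least one peer is required")
    for peer in peers:
        # Sketch-level rule: expect TLS-protected peer URLs.
        if not peer.startswith("https://"):
            problems.append(f"peer {peer} does not use https")
    # The config comments require the key prefix to end with '/'.
    if not str(storage.get("prefix", "")).endswith("/"):
        problems.append("prefix must end with '/'")
    return problems


example = {
    "type": "etcd",
    "peers": ["https://172.17.0.1:4001", "https://172.17.0.2:4001"],
    "prefix": "/teleport/",
    # 15 MiB expressed in bytes, matching the value in the config above.
    "etcd_max_client_msg_size_bytes": 15 * MIB,
}
print(check_etcd_storage(example))
print(example["etcd_max_client_msg_size_bytes"])
```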
PostgreSQL
PostgreSQL cluster state and audit log storage is available starting from Teleport 13.3.
Teleport can use PostgreSQL as a storage backend to achieve high availability. You must take steps to protect access to PostgreSQL in this configuration because that is where Teleport secrets like keys and user records will be stored. The PostgreSQL backend supports two types of Teleport data:
- Cluster state
- Audit log events
The PostgreSQL backend requires PostgreSQL 13 or later, and, for the cluster state only, the wal2json logical decoding plugin. The plugin is available in packages for all stable versions in the PostgreSQL Apt and Yum repositories for Debian- and RPM-based Linux distributions respectively, or it can be compiled following the instructions provided in its repository. The plugin is pre-installed with no extra steps to take in Azure Database for PostgreSQL.
CockroachDB can be used as a PostgreSQL drop-in replacement to store audit events (requires Teleport version >= 15.4.2).
Teleport can store the cluster state in CockroachDB, but this requires CockroachDB-specific configuration. See the CockroachDB backend section for more details.
Teleport needs separate databases for the cluster state and the audit log, and it will attempt to create them if given permissions to do so; it will also set up the database schemas as needed, so we recommend giving the user ownership over the databases.
The PostgreSQL backend for cluster state relies on the ability to use logical decoding to get a stream of changes from the database; because of that, the wal_level parameter must be set to logical and max_replication_slots must be set to at least as many Teleport Auth Service instances as you'll be running (a higher number is recommended, to account for network conditions).
The Teleport Auth Service needs to be able to create a replication slot when starting and when reestablishing a new connection to the PostgreSQL cluster, and any long-running transaction will prevent that. It's therefore only advisable to store the Teleport cluster state on a shared PostgreSQL cluster if the other workloads on the cluster only consist of short-lived transactions.
wal_level can only be set at server start, so it should be set in postgresql.conf:
# the default value for wal_level is replica
wal_level = logical
# the default value for max_replication_slots is 10
max_replication_slots = 10
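The two settings can be checked before Teleport is deployed. The sketch below is a hypothetical helper, not a Teleport or PostgreSQL tool: it parses `name = value` lines from postgresql.conf text and compares them with the number of Auth Service instances you plan to run. It assumes the defaults named in the comments of the snippet above and ignores the alternative `name value` syntax that PostgreSQL also accepts.

```python
def parse_conf(text: str) -> dict[str, str]:
    """Parse `name = value` lines, ignoring comments and blank lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        name, sep, value = line.partition("=")
        if sep:
            settings[name.strip()] = value.strip().strip("'\"")
    return settings


def replication_problems(conf_text: str, auth_instances: int) -> list[str]:
    """List settings that would prevent the cluster state backend from working."""
    settings = parse_conf(conf_text)
    problems = []
    # Defaults as stated in the comments of the snippet above.
    if settings.get("wal_level", "replica") != "logical":
        problems.append("wal_level must be logical")
    slots = int(settings.get("max_replication_slots", "10"))
    if slots < auth_instances:
        problems.append(
            f"max_replication_slots is {slots}, need at least {auth_instances}"
        )
    return problems
```

Calling `replication_problems(open("postgresql.conf").read(), 3)` on a correctly configured server returns an empty list.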
In addition, the database user must have the initiating replication role attribute. In the psql shell:
postgres=# CREATE USER new_user WITH REPLICATION;
CREATE ROLE
postgres=# ALTER ROLE existing_user WITH LOGIN REPLICATION;
ALTER ROLE
Since replication permissions allow for essentially full read access over the entire cluster (with a physical replication connection) or to all databases that the user can connect to, it's recommended to prevent the user from opening replication connections, or from connecting to databases other than the ones used for Teleport, if the PostgreSQL cluster is shared between Teleport and other applications.
For convenience, Teleport will attempt to grant itself the initiating replication role attribute, to accommodate the ability of some managed services (such as Azure Database for PostgreSQL) to create superuser accounts through their API; this should only be leveraged if the entire PostgreSQL cluster is dedicated to Teleport.
To configure Teleport to use PostgreSQL:
- Configure all Teleport Auth Service instances to use the PostgreSQL backend in the storage section of teleport.yaml as shown below.
- Deploy several Auth Service instances connected to the PostgreSQL storage backend.
- Deploy several Proxy Service nodes.
- Make sure that the Proxy Service instances and all Teleport Agent services that connect directly to the Auth Service have the auth_server configuration setting populated with the address of a load balancer for Auth Service instances.
Teleport must connect directly to the Postgres server.
pgbouncer is incompatible with the Teleport PostgreSQL storage backend.
teleport:
  storage:
    type: postgresql
    # conn_string is a libpq-compatible connection string (see
    # https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING);
    # pool_max_conns is an additional parameter that determines the maximum
    # number of connections in the connection pool used for the cluster state
    # database (the change feed uses an additional connection), defaulting to
    # a value that depends on the number of available CPUs.
    #
    # If your certificates are not stored at the default ~/.postgresql
    # location, you will need to specify them with the sslcert, sslkey, and
    # sslrootcert parameters.
    conn_string: postgresql://user_name@database-address/teleport_backend?sslmode=verify-full&pool_max_conns=20
    # In certain managed environments it can be necessary or convenient to
    # use a different user or different settings for the connection used
    # to set up and make use of logical decoding. If specified, Teleport
    # will use the connection string in change_feed_conn_string for that,
    # instead of the one in conn_string. Available in Teleport 13.4 and later.
    change_feed_conn_string: postgresql://replication_user_name@database-address/teleport_backend?sslmode=verify-full
    # An audit_events_uri with a scheme of postgresql:// will use the
    # PostgreSQL backend for audit log storage; the URI is a libpq-compatible
    # connection string just like the cluster state conn_string, but cannot be
    # specified as key=value pairs. It's possible to specify completely
    # different PostgreSQL clusters for cluster state and audit log.
    #
    # If your certificates are not stored at the default ~/.postgresql
    # location, you will need to specify them with the sslcert, sslkey, and
    # sslrootcert parameters.
    audit_events_uri:
      - postgresql://user_name@database-address/teleport_audit?sslmode=verify-full
Audit log events are periodically deleted after a default retention period of 8766 hours (one year); it's possible to select a different retention period or to disable the cleanup entirely, by specifying the retention_period or the disable_cleanup parameters in the fragment of the URI:
teleport:
  storage:
    audit_events_uri:
      - postgresql://user_name@database-address/teleport_audit?sslmode=verify-full#disable_cleanup=false&retention_period=2160h
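Because the cleanup options live in the URI fragment (after `#`) rather than in the query string, they are easy to misplace. The Python sketch below appends them with the standard library; `with_cleanup_options` is a hypothetical helper name, and the only parameters it knows about are the two named above.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# One year expressed in hours, as stated above (365.25 days * 24).
DEFAULT_RETENTION_HOURS = 8766


def with_cleanup_options(uri, retention_hours=None, disable_cleanup=None):
    """Return the URI with cleanup options placed in the fragment."""
    options = {}
    if disable_cleanup is not None:
        options["disable_cleanup"] = "true" if disable_cleanup else "false"
    if retention_hours is not None:
        options["retention_period"] = f"{retention_hours}h"
    parts = urlsplit(uri)
    return urlunsplit(parts._replace(fragment=urlencode(options)))


base = "postgresql://user_name@database-address/teleport_audit?sslmode=verify-full"
# 2160 hours is 90 days.
print(with_cleanup_options(base, retention_hours=2160, disable_cleanup=False))
```

The printed value is the same URI used in the configuration example above.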
Authentication
We strongly recommend using client certificates to authenticate Teleport to PostgreSQL, as well as enforcing the use of TLS and verifying the server certificate on the client side.
You will need to update your pg_hba.conf file to include the following lines to ensure connections to Teleport use client certificates. See "The pg_hba.conf file" in the PostgreSQL documentation for more details.
# TYPE DATABASE USER CIDR-ADDRESS METHOD
hostssl teleport all ::/0 cert
hostssl teleport all 0.0.0.0/0 cert
If the use of passwords is unavoidable, we recommend configuring them in the ~/.pgpass file rather than storing them in Teleport's configuration file.
Microsoft Entra ID authentication
If you are running Teleport on Azure, Teleport can make use of Microsoft Entra ID authentication to connect to an Azure Database for PostgreSQL server without having to manage any secrets:
teleport:
  storage:
    type: postgresql
    conn_string: postgresql://[email protected]/teleport_backend?sslmode=verify-full&pool_max_conns=20
    auth_mode: azure
    audit_events_uri:
      - postgresql://[email protected]/teleport_audit?sslmode=verify-full#auth_mode=azure
When auth_mode is set to azure, Teleport will automatically fetch short-lived tokens from the credentials available to it, to be used as database passwords. The database user must be configured to allow connections using Microsoft Entra ID.
Teleport will make use of the Microsoft Entra ID credentials specified by environment variables, Microsoft Entra Workload ID credentials, or managed identity credentials.
Google Cloud IAM authentication
If you are running Teleport on Google Cloud, Teleport can make use of IAM Authentication to connect to a GCP Cloud SQL for PostgreSQL instance without having to manage any secrets:
teleport:
  storage:
    type: postgresql
    auth_mode: gcp-cloudsql
    # GCP connection name has the format <project>:<location>:<instance>.
    gcp_connection_name: project:location:instance
    # The type of IP address to use for connecting to the Cloud SQL instance. Valid options are:
    # - "" (default to "public")
    # - "public"
    # - "private"
    # - "psc" (for Private Service Connect)
    gcp_ip_type: public
    # Leave host and port empty as they are not required.
    conn_string: postgresql://[email protected]@/teleport_backend
    audit_events_uri:
      - postgresql://[email protected]@/teleport_audit#auth_mode=gcp-cloudsql&gcp_connection_name=project:location:instance&gcp_ip_type=public
To enable IAM authentication and logical replication for Cloud SQL, make sure the flags cloudsql.iam_authentication and cloudsql.logical_decoding are set to on for the Cloud SQL instance. The database user must also have the REPLICATION role attribute to use the logical decoding features. See "Set up logical replication and decoding" for more details.
In order for Teleport to use the Cloud SQL Go Connector with IAM authentication, the service account of the target database user must have the "Cloud SQL Client" (roles/cloudsql.client) and "Cloud SQL Instance User" (roles/cloudsql.instanceUser) roles assigned.
Teleport will make use of the credentials specified through the GOOGLE_APPLICATION_CREDENTIALS environment variable, Workload Identity Federation with service account impersonation, or service account credentials attached to VMs.
If the service account used in the PostgreSQL connection string is different from the service account of the default credentials, Teleport will impersonate the service account used in the connection string as a Service Account Token Creator using the default credentials.
Development
If you are not ready to connect Teleport to a production instance of PostgreSQL, you can use the following instructions to set up a throwaway instance of PostgreSQL using Docker.
First copy the following script to disk and run it to generate the CA, client certificate, and server certificate used by Teleport and PostgreSQL to establish a secure mutually authenticated connection:
#!/bin/bash
# Create the certs directory.
mkdir -p ./certs
cd certs/
# Create CA key and self-signed certificate.
openssl genpkey -algorithm RSA -out ca.key
openssl req -x509 -new -key ca.key -out ca.crt -subj "/CN=root"
# Function to create certificates.
create_certificate() {
local name="$1"
local dns_name="$2"
openssl genpkey \
-algorithm RSA \
-out "${name}.key"
openssl req -new \
-key "${name}.key" \
-out "${name}.csr" \
-subj "/CN=${dns_name}"
openssl x509 -req \
-in "${name}.csr" \
-CA ca.crt \
-CAkey ca.key \
-out "${name}.crt" \
-extfile <(printf "subjectAltName=DNS:${dns_name}") \
-CAcreateserial
chmod 0600 "${name}.key"
}
# Create client certificate with SAN.
create_certificate "client" "teleport"
# Create server certificate with SAN.
create_certificate "server" "localhost"
echo "Certificates and keys generated successfully."
Next, create a Dockerfile using the official PostgreSQL Docker image and add wal2json to it:
FROM postgres:15.0
RUN apt-get update
RUN apt-get install -y postgresql-15-wal2json
Create an init.sql file that will ensure the Teleport user is created upon startup of the container:
CREATE USER teleport WITH REPLICATION CREATEDB;
Create a pg_hba.conf file to enforce certificate-based authentication for connections to PostgreSQL:
# TYPE DATABASE USER CIDR-ADDRESS METHOD
local all all trust
hostssl all all ::/0 cert
hostssl all all 0.0.0.0/0 cert
Create a postgresql.conf file that configures the WAL level and certificates used for authentication:
listen_addresses = '*'
port = 5432
max_connections = 20
shared_buffers = 128MB
temp_buffers = 8MB
work_mem = 4MB
wal_level=logical
max_replication_slots=10
ssl=on
ssl_ca_file='/certs/ca.crt'
ssl_cert_file='/certs/server.crt'
ssl_key_file='/certs/server.key'
Start the PostgreSQL container with the following command:
docker run --rm --name postgres \
-e POSTGRES_DB=db \
-e POSTGRES_USER=user \
-e POSTGRES_PASSWORD=password \
-v $(pwd)/data:/var/lib/postgresql/data \
-v $(pwd)/certs:/certs \
-v $(pwd)/postgresql.conf:/etc/postgresql/postgresql.conf \
-v $(pwd)/pg_hba.conf:/etc/postgresql/pg_hba.conf \
-v $(pwd)/init.sql:/docker-entrypoint-initdb.d/init.sql \
-p 5432:5432 \
$(docker build -q .) \
postgres \
-c hba_file=/etc/postgresql/pg_hba.conf \
-c config_file=/etc/postgresql/postgresql.conf
Lastly, update the storage section in teleport.yaml to use PostgreSQL and start Teleport:
teleport:
  storage:
    type: postgresql
    conn_string: "postgresql://teleport@localhost:5432/teleport_backend?sslcert=/path/to/certs/client.crt&sslkey=/path/to/certs/client.key&sslrootcert=/path/to/certs/ca.crt&sslmode=verify-full&pool_max_conns=20"
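The connection string above is long enough that assembling it from its parts can help avoid typos. The following Python sketch builds it from a certificate directory; `dev_conn_string` is a hypothetical helper, and the file names client.crt, client.key, and ca.crt are the ones produced by the certificate script earlier in this section.

```python
from urllib.parse import urlencode


def dev_conn_string(user, host, port, database, cert_dir, pool_max_conns=20):
    """Build a libpq-style URI that points at the generated client certificates."""
    params = {
        "sslcert": f"{cert_dir}/client.crt",
        "sslkey": f"{cert_dir}/client.key",
        "sslrootcert": f"{cert_dir}/ca.crt",
        "sslmode": "verify-full",
        "pool_max_conns": pool_max_conns,
    }
    # Keep '/' unescaped so the file paths stay readable in the URI.
    query = urlencode(params, safe="/")
    return f"postgresql://{user}@{host}:{port}/{database}?{query}"


print(dev_conn_string("teleport", "localhost", 5432, "teleport_backend", "/path/to/certs"))
```

Replace `/path/to/certs` with the absolute path of the certs directory created by the script.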
S3 (Session Recordings)
Teleport supports using S3 as a backend for both session recordings and audit logs. S3 cannot be used as the cluster state backend. This section covers the use of S3 as a session recording backend. For information on using S3 for audit logs, see the Athena section.
S3 buckets must have versioning enabled, which ensures that a session log cannot be permanently altered or deleted. Teleport will always look at the oldest version of a recording.
Authenticating to AWS
The Teleport Auth Service must be able to read AWS credentials in order to authenticate to S3.
Grant the Teleport Auth Service access to credentials that it can use to authenticate to AWS.
- If you are running the Teleport Auth Service on an EC2 instance, you may use the EC2 Instance Metadata Service method
- If you are running the Teleport Auth Service in Kubernetes, you can use IAM Roles for Service Accounts (IRSA)
- Otherwise, you must use environment variables
Instance Metadata Service
Teleport will detect when it is running on an EC2 instance and use the Instance Metadata Service to fetch credentials.
The EC2 instance should be configured to use an EC2 instance profile. For more information, see: Using Instance Profiles.
Kubernetes IRSA
Refer to IAM Roles for Service Accounts (IRSA) to set up an OIDC provider in AWS and configure an AWS IAM role that allows the pod's service account to assume the role.
Environment Variables
Teleport's built-in AWS client reads credentials from the following environment variables:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_DEFAULT_REGION
When you start the Teleport Auth Service, the service reads environment variables from a file at the path /etc/default/teleport. Obtain these credentials from your organization. Ensure that /etc/default/teleport has the following content, replacing the values of each variable:
AWS_ACCESS_KEY_ID=00000000000000000000
AWS_SECRET_ACCESS_KEY=0000000000000000000000000000000000000000
AWS_DEFAULT_REGION=<YOUR_REGION>
Have multiple sources of AWS credentials?
Teleport's AWS client loads credentials from different sources in the following order:
- Environment Variables
- Shared credentials file
- Shared configuration file (Teleport always enables shared configuration)
- EC2 Instance Metadata (credentials only)
While you can provide AWS credentials via a shared credentials file or shared configuration file, you will need to run the Teleport Auth Service with the AWS_PROFILE environment variable assigned to the name of your profile of choice.
If you have a specific use case that the instructions above do not account for, consult the documentation for the AWS SDK for Go for a detailed description of credential loading behavior.
Configuring the S3 backend
Below is an example of how to configure the Teleport Auth Service to store the recorded sessions in an S3 bucket.
teleport:
  storage:
    # The region setting sets the default AWS region for all AWS services
    # Teleport may consume (DynamoDB, S3)
    region: us-east-1
    # Path to S3 bucket to store the recorded sessions in.
    audit_sessions_uri: "s3://Example_TELEPORT_S3_BUCKET/records"
    # Teleport assumes credentials are available through the AWS provider
    # chain, an assumed IAM role, or the standard .aws/credentials file in
    # the home folder.
You can add optional query parameters to the S3 URL. The Teleport Auth Service reads these parameters to configure its interactions with S3:
s3://bucket/path?region=us-east-1&endpoint=mys3.example.com&insecure=false&disablesse=false&acl=private&use_fips_endpoint=true
- region=us-east-1 - set the Amazon region to use.
- endpoint=mys3.example.com - connect to a custom S3 endpoint. Optional.
- insecure=true - set to true or false. If true, TLS will be disabled. Default value is false.
- disablesse=true - set to true or false. The Auth Service checks this value before uploading an object to an S3 bucket. If this is false, the Auth Service will set the server-side encryption configuration of the upload to use AWS Key Management Service and, if sse_kms_key is set, configure the upload to use this key. If this value is true, the Auth Service will not set an explicit server-side encryption configuration for the object upload, meaning that the upload will use the bucket-level server-side encryption configuration.
- sse_kms_key=kms_key_id - If set to a valid AWS KMS CMK key ID, all objects uploaded to S3 will be encrypted with this key (as long as disablesse is false). Details can be found below.
- acl=private - set the canned ACL to use. Must be one of the predefined ACL values.
- use_fips_endpoint=true - Configure S3 FIPS endpoints.
- use_s3_virtual_style_addressing - Whether to use virtual-host-style instead of path-style URLs for the bucket. Only applies when a custom endpoint is set. Defaults to false when unset. If used without a custom endpoint set, this option has no effect.
- complete_initiators - When specified, Teleport will only complete uploads initiated by the specified set of initiators. This is helpful in scenarios where software other than Teleport is initiating multipart uploads in the recordings bucket. This should be set to the display name of the initiator(s) you want to allow.
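To make the shape of these URLs concrete, the Python sketch below assembles an audit_sessions_uri from keyword options and reads the options back. Both helper names are hypothetical, and the sketch does not validate option names or values against what Teleport accepts; it only shows how the query string is put together.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def s3_sessions_uri(bucket: str, path: str, **options) -> str:
    """Build s3://bucket/path?key=value&... with booleans written as true/false."""
    normalized = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in options.items()
    }
    uri = f"s3://{bucket}/{path}"
    query = urlencode(normalized)
    return f"{uri}?{query}" if query else uri


def s3_options(uri: str) -> dict[str, str]:
    """Return the query parameters of an S3 URI as a flat dictionary."""
    return {key: values[-1] for key, values in parse_qs(urlsplit(uri).query).items()}


uri = s3_sessions_uri("bucket", "path", region="us-east-1", insecure=False, acl="private")
print(uri)
print(s3_options(uri))
```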
S3 IAM policy
On startup, the Teleport Auth Service checks whether the S3 bucket you have configured for session recording storage exists. If it does not, the Auth Service attempts to create and configure the bucket.
The IAM permissions that the Auth Service requires to manage its session recording bucket depends on whether you expect to create the bucket yourself or enable the Auth Service to create and configure it for you:
- Manage the Bucket Yourself
- Auth Service Creates a Bucket
Note that Teleport will only use S3 buckets with versioning enabled. This ensures that a session log cannot be permanently altered or deleted, as Teleport will always look at the oldest version of a recording.
Manage the Bucket Yourself
You'll need to replace these values in the policy example below:
|Placeholder value|Replace with|
|---|---|
|your-sessions-bucket|Name to use for the Teleport S3 session recording bucket|
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "BucketActions",
"Effect": "Allow",
"Action": [
"s3:ListBucketVersions",
"s3:ListBucketMultipartUploads",
"s3:ListBucket",
"s3:GetEncryptionConfiguration",
"s3:GetBucketVersioning"
],
"Resource": "arn:aws:s3:::your-sessions-bucket"
},
{
"Sid": "ObjectActions",
"Effect": "Allow",
"Action": [
"s3:GetObjectVersion",
"s3:GetObjectRetention",
"s3:GetObject",
"s3:PutObject",
"s3:ListMultipartUploadParts",
"s3:AbortMultipartUpload"
],
"Resource": "arn:aws:s3:::your-sessions-bucket/*"
}
]
}
Auth Service Creates a Bucket
You'll need to replace these values in the policy example below:
|Placeholder value|Replace with|
|---|---|
|your-sessions-bucket|Name to use for the Teleport S3 session recording bucket|
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "BucketActions",
"Effect": "Allow",
"Action": [
"s3:PutEncryptionConfiguration",
"s3:PutBucketVersioning",
"s3:ListBucketVersions",
"s3:ListBucketMultipartUploads",
"s3:ListBucket",
"s3:GetEncryptionConfiguration",
"s3:GetBucketVersioning",
"s3:CreateBucket"
],
"Resource": "arn:aws:s3:::your-sessions-bucket"
},
{
"Sid": "ObjectActions",
"Effect": "Allow",
"Action": [
"s3:GetObjectVersion",
"s3:GetObjectRetention",
"s3:*Object",
"s3:ListMultipartUploadParts",
"s3:AbortMultipartUpload"
],
"Resource": "arn:aws:s3:::your-sessions-bucket/*"
}
]
}
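If you manage several clusters, substituting the bucket name by hand becomes error-prone. As a sketch, the Python below renders the policy for the case where you manage the bucket yourself; `sessions_bucket_policy` is a hypothetical helper, and the action lists are copied from the first policy above.

```python
import json

# Actions copied from the "manage the bucket yourself" policy above.
BUCKET_ACTIONS = [
    "s3:ListBucketVersions",
    "s3:ListBucketMultipartUploads",
    "s3:ListBucket",
    "s3:GetEncryptionConfiguration",
    "s3:GetBucketVersioning",
]
OBJECT_ACTIONS = [
    "s3:GetObjectVersion",
    "s3:GetObjectRetention",
    "s3:GetObject",
    "s3:PutObject",
    "s3:ListMultipartUploadParts",
    "s3:AbortMultipartUpload",
]


def sessions_bucket_policy(bucket: str) -> dict:
    """Return the IAM policy document for an existing session recording bucket."""
    arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": "BucketActions", "Effect": "Allow",
             "Action": BUCKET_ACTIONS, "Resource": arn},
            {"Sid": "ObjectActions", "Effect": "Allow",
             "Action": OBJECT_ACTIONS, "Resource": f"{arn}/*"},
        ],
    }


print(json.dumps(sessions_bucket_policy("your-sessions-bucket"), indent=2))
```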
S3 Server Side Encryption
Teleport supports using a custom AWS KMS Customer Managed Key for encrypting objects uploaded to S3. This allows you to restrict who can read objects like session recordings separately from those that have read access to a bucket by restricting key access.
The sse_kms_key parameter above can be set to any valid KMS CMK ID corresponding to a symmetric standard spec KMS key. Example template KMS key policies are provided below for common use cases. IAM users do not have access to any key by default. Permissions have to be explicitly granted in the policy.
Encryption/Decryption
This policy allows an IAM user to encrypt and decrypt objects, which lets a cluster's Auth Service write and play back session recordings.
Replace [iam-key-admin-arn] with the IAM ARN of the user(s) that should have administrative key access and [auth-node-iam-arn] with the IAM ARN of the user the Teleport auth nodes are using.
{
"Id": "Teleport Encryption and Decryption",
"Version": "2012-10-17",
"Statement": [
{
"Sid": "Teleport CMK Admin",
"Effect": "Allow",
"Principal": {
"AWS": "[iam-key-admin-arn]"
},
"Action": "kms:*",
"Resource": "*"
},
{
"Sid": "Teleport CMK Auth",
"Effect": "Allow",
"Principal": {
"AWS": "[auth-node-iam-arn]"
},
"Action": [
"kms:Encrypt",
"kms:Decrypt",
"kms:ReEncrypt*",
"kms:GenerateDataKey*",
"kms:DescribeKey"
],
"Resource": "*"
}
]
}
Encryption/Decryption with separate clusters
This policy allows specifying separate IAM users for encryption and decryption. This can be used to set up a multi cluster configuration where the main cluster cannot play back session recordings but only write them. A separate cluster authenticating as a different IAM user with decryption access can be used for playing back the session recordings.
Replace [iam-key-admin-arn] with the IAM ARN of the user(s) that should have administrative key access, [auth-node-write-arn] with the IAM ARN of the user the main write-only cluster auth nodes are using, and [auth-node-read-arn] with the IAM ARN of the user used by the read-only cluster.
For this to work the second cluster has to be connected to the same audit log as the main cluster. This is needed to detect session recordings.
{
"Id": "Teleport Separate Encryption and Decryption",
"Version": "2012-10-17",
"Statement": [
{
"Sid": "Teleport CMK Admin",
"Effect": "Allow",
"Principal": {
"AWS": "[iam-key-admin-arn]"
},
"Action": "kms:*",
"Resource": "*"
},
{
"Sid": "Teleport CMK Auth Encrypt",
"Effect": "Allow",
"Principal": {
"AWS": "[auth-node-write-arn]"
},
"Action": [
"kms:Encrypt",
"kms:ReEncrypt*",
"kms:GenerateDataKey*",
"kms:DescribeKey"
],
"Resource": "*"
},
{
"Sid": "Teleport CMK Auth Decrypt",
"Effect": "Allow",
"Principal": {
"AWS": "[auth-node-read-arn]"
},
"Action": [
"kms:Decrypt",
"kms:DescribeKey"
],
"Resource": "*"
}
]
}
ACL example: transferring object ownership
If you are uploading from AWS account A to a bucket owned by AWS account B and want B, the bucket owner, to own the uploaded objects, you can take one of two approaches.
Without ACLs
If ACLs are disabled, object ownership will be set to
Bucket owner enforced and no action will be needed.
With ACLs
- Set object ownership to Bucket owner preferred (under Permissions in the management console).
- Add acl=bucket-owner-full-control to audit_sessions_uri.
To enforce the ownership transfer, set the policy of B's bucket to only allow uploads that include the bucket-owner-full-control canned ACL.
{
"Version": "2012-10-17",
"Id": "[id]",
"Statement": [
{
"Sid": "[sid]",
"Effect": "Allow",
"Principal": {
"AWS": "[ARN of account A]"
},
"Action": "s3:PutObject",
"Resource": "arn:aws:s3:::BucketName/*",
"Condition": {
"StringEquals": {
"s3:x-amz-acl": "bucket-owner-full-control"
}
}
}
]
}
For more information, see the AWS Documentation.
DynamoDB
If you are running Teleport on AWS, you can use DynamoDB as a storage backend to achieve High Availability. DynamoDB backend supports two types of Teleport data:
- Cluster state
- Audit log events
Teleport uses DynamoDB and DynamoDB Streams endpoints for its storage backend management.
DynamoDB cannot store the recorded sessions. You are advised to use AWS S3 for that as shown above.
Authenticating to AWS
The Teleport Auth Service must be able to read AWS credentials in order to authenticate to DynamoDB.
Grant the Teleport Auth Service access to credentials that it can use to authenticate to AWS.
- If you are running the Teleport Auth Service on an EC2 instance, you may use the EC2 Instance Metadata Service method
- If you are running the Teleport Auth Service in Kubernetes, you can use IAM Roles for Service Accounts (IRSA)
- Otherwise, you must use environment variables
Instance Metadata Service
Teleport will detect when it is running on an EC2 instance and use the Instance Metadata Service to fetch credentials.
The EC2 instance should be configured to use an EC2 instance profile. For more information, see: Using Instance Profiles.
Kubernetes IRSA
Refer to IAM Roles for Service Accounts (IRSA) to set up an OIDC provider in AWS and configure an AWS IAM role that allows the pod's service account to assume the role.
Environment Variables
Teleport's built-in AWS client reads credentials from the following environment variables:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_DEFAULT_REGION
When you start the Teleport Auth Service, the service reads environment variables from a file at the path /etc/default/teleport. Obtain these credentials from your organization. Ensure that /etc/default/teleport has the following content, replacing the values of each variable:
AWS_ACCESS_KEY_ID=00000000000000000000
AWS_SECRET_ACCESS_KEY=0000000000000000000000000000000000000000
AWS_DEFAULT_REGION=<YOUR_REGION>
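Before restarting the Auth Service, it can help to confirm that the environment file defines all three variables. The sketch below is a hypothetical helper, not part of Teleport: it parses KEY=VALUE text in the format shown above and reports which of the three variables are missing or empty.

```python
REQUIRED = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def parse_env_file(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def missing_variables(text: str) -> list:
    """Return the required variable names that are absent or empty."""
    env = parse_env_file(text)
    return [name for name in REQUIRED if not env.get(name)]


sample = """\
AWS_ACCESS_KEY_ID=00000000000000000000
AWS_SECRET_ACCESS_KEY=0000000000000000000000000000000000000000
"""
print(missing_variables(sample))
```

To check a real file, pass its contents to missing_variables, for example the text read from /etc/default/teleport.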
Have multiple sources of AWS credentials?
Teleport's AWS client loads credentials from different sources in the following order:
- Environment Variables
- Shared credentials file
- Shared configuration file (Teleport always enables shared configuration)
- EC2 Instance Metadata (credentials only)
While you can provide AWS credentials via a shared credentials file or shared configuration file, you will need to run the Teleport Auth Service with the AWS_PROFILE environment variable assigned to the name of your profile of choice.
If you have a specific use case that the instructions above do not account for, consult the documentation for the AWS SDK for Go for a detailed description of credential loading behavior.
The IAM role that the Teleport Auth Service authenticates as must have the policies specified in the next section.
IAM policies
Make sure that the IAM role assigned to Teleport is configured with sufficient access to DynamoDB.
On startup, the Teleport Auth Service checks whether the DynamoDB table you have specified in its configuration file exists. If the table does not exist, the Auth Service attempts to create one.
The IAM permissions that the Auth Service requires to manage DynamoDB tables depend on whether you expect to create a table yourself or enable the Auth Service to create and configure one for you:
- Manage a Table Yourself
- Auth Service Creates a Table
If you choose to manage DynamoDB tables yourself, you must take the following steps, which we will explain in more detail below:
- Create a cluster state table.
- Create an audit event table.
- Create an IAM policy and attach it to the Teleport Auth Service's IAM identity.
Create a cluster state table
The cluster state table must have the following attribute definitions:
| Name | Type |
| HashKey | S |
| FullPath | S |
The table must also have the following key schema elements:
| Name | Type |
| HashKey | HASH |
| FullPath | RANGE |
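The attribute definitions and key schema above map directly onto the parameters of the DynamoDB CreateTable API. The sketch below builds such a request as a plain dictionary and prints it as JSON. The function name is hypothetical, and the BillingMode value is an assumption chosen to match the pay_per_request default described later on this page. Stream, TTL, and backup settings are deliberately left out.

```python
import json


def cluster_state_table_request(table_name: str) -> dict:
    """Build CreateTable parameters for the cluster state table."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "HashKey", "AttributeType": "S"},
            {"AttributeName": "FullPath", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "HashKey", "KeyType": "HASH"},
            {"AttributeName": "FullPath", "KeyType": "RANGE"},
        ],
        # Assumption: on-demand billing, not mandated by the schema itself.
        "BillingMode": "PAY_PER_REQUEST",
    }


print(json.dumps(cluster_state_table_request("teleport-helm-backend"), indent=2))
```

The printed JSON could be saved to a file and passed to aws dynamodb create-table with --cli-input-json, or the dictionary could be passed as keyword arguments to an SDK's create-table call.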
Create an audit event table
The audit event table must have the following attribute definitions:
| Name | Type |
| SessionID | S |
| EventIndex | N |
| CreatedAtDate | S |
| CreatedAt | N |
The table must also have the following key schema elements:
| Name | Type |
| CreatedAtDate | HASH |
| CreatedAt | RANGE |
Create and attach an IAM policy
Create the following IAM policy and attach it to the Teleport Auth Service's IAM identity.
You'll need to replace these values in the policy example below:
| Placeholder value | Replace with |
| us-west-2 | AWS region |
| 1234567890 | AWS account ID |
| teleport-helm-backend | DynamoDB table name to use for the Teleport backend |
| teleport-helm-events | DynamoDB table name to use for the Teleport audit log (must be different to the backend table) |
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "ClusterStateStorage",
"Effect": "Allow",
"Action": [
"dynamodb:BatchWriteItem",
"dynamodb:UpdateTimeToLive",
"dynamodb:PutItem",
"dynamodb:DeleteItem",
"dynamodb:Scan",
"dynamodb:Query",
"dynamodb:DescribeStream",
"dynamodb:UpdateItem",
"dynamodb:DescribeTimeToLive",
"dynamodb:DescribeTable",
"dynamodb:GetShardIterator",
"dynamodb:GetItem",
"dynamodb:ConditionCheckItem",
"dynamodb:UpdateTable",
"dynamodb:GetRecords",
"dynamodb:UpdateContinuousBackups"
],
"Resource": [
"arn:aws:dynamodb:us-west-2:1234567890:table/teleport-helm-backend",
"arn:aws:dynamodb:us-west-2:1234567890:table/teleport-helm-backend/stream/*"
]
},
{
"Sid": "ClusterEventsStorage",
"Effect": "Allow",
"Action": [
"dynamodb:BatchWriteItem",
"dynamodb:UpdateTimeToLive",
"dynamodb:PutItem",
"dynamodb:DescribeTable",
"dynamodb:DeleteItem",
"dynamodb:GetItem",
"dynamodb:Scan",
"dynamodb:Query",
"dynamodb:UpdateItem",
"dynamodb:DescribeTimeToLive",
"dynamodb:UpdateTable",
"dynamodb:UpdateContinuousBackups"
],
"Resource": [
"arn:aws:dynamodb:us-west-2:1234567890:table/teleport-helm-events",
"arn:aws:dynamodb:us-west-2:1234567890:table/teleport-helm-events/index/*"
]
}
]
}
Note that you can omit the dynamodb:UpdateContinuousBackups permission if you disable continuous backups.
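The placeholder substitution can be scripted instead of edited by hand. The sketch below (a hypothetical helper, not a Teleport tool) derives the Resource ARNs for the two policy statements from the four values listed in the table above, and rejects a configuration in which both tables share a name.

```python
def dynamodb_policy_resources(region: str, account_id: str, state_table: str, events_table: str) -> dict:
    """Return the Resource lists for the two policy statements, keyed by Sid."""
    if state_table == events_table:
        raise ValueError("the cluster state and audit event tables must be different")
    prefix = f"arn:aws:dynamodb:{region}:{account_id}:table"
    return {
        "ClusterStateStorage": [
            f"{prefix}/{state_table}",
            f"{prefix}/{state_table}/stream/*",
        ],
        "ClusterEventsStorage": [
            f"{prefix}/{events_table}",
            f"{prefix}/{events_table}/index/*",
        ],
    }


resources = dynamodb_policy_resources(
    "us-west-2", "1234567890", "teleport-helm-backend", "teleport-helm-events"
)
for sid, arns in resources.items():
    print(sid, arns)
```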
Auth Service Creates a Table
If you want the Auth Service to create the tables for you, attach the following IAM policy instead. It is identical to the policy above except that it also grants dynamodb:CreateTable on both tables. Replace the same placeholder values (us-west-2, 1234567890, teleport-helm-backend, and teleport-helm-events) as described above.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "ClusterStateStorage",
"Effect": "Allow",
"Action": [
"dynamodb:BatchWriteItem",
"dynamodb:UpdateTimeToLive",
"dynamodb:PutItem",
"dynamodb:DeleteItem",
"dynamodb:Scan",
"dynamodb:Query",
"dynamodb:DescribeStream",
"dynamodb:UpdateItem",
"dynamodb:DescribeTimeToLive",
"dynamodb:CreateTable",
"dynamodb:DescribeTable",
"dynamodb:GetShardIterator",
"dynamodb:GetItem",
"dynamodb:ConditionCheckItem",
"dynamodb:UpdateTable",
"dynamodb:GetRecords",
"dynamodb:UpdateContinuousBackups"
],
"Resource": [
"arn:aws:dynamodb:us-west-2:1234567890:table/teleport-helm-backend",
"arn:aws:dynamodb:us-west-2:1234567890:table/teleport-helm-backend/stream/*"
]
},
{
"Sid": "ClusterEventsStorage",
"Effect": "Allow",
"Action": [
"dynamodb:CreateTable",
"dynamodb:BatchWriteItem",
"dynamodb:UpdateTimeToLive",
"dynamodb:PutItem",
"dynamodb:DescribeTable",
"dynamodb:DeleteItem",
"dynamodb:GetItem",
"dynamodb:Scan",
"dynamodb:Query",
"dynamodb:UpdateItem",
"dynamodb:DescribeTimeToLive",
"dynamodb:UpdateTable",
"dynamodb:UpdateContinuousBackups"
],
"Resource": [
"arn:aws:dynamodb:us-west-2:1234567890:table/teleport-helm-events",
"arn:aws:dynamodb:us-west-2:1234567890:table/teleport-helm-events/index/*"
]
}
]
}
Configuring the DynamoDB backend
To configure Teleport to use DynamoDB:
- Configure all Teleport Auth servers to use the DynamoDB backend in the "storage" section of teleport.yaml as shown below.
- Auth servers must be able to reach the DynamoDB and DynamoDB Streams endpoints.
- Deploy up to two Auth servers connected to the DynamoDB storage backend.
- Deploy several proxy nodes.
- Make sure that all Teleport resource services have the auth_servers configuration setting populated with the addresses of your cluster's Auth Service instances.
AWS can throttle DynamoDB if more than two processes are reading from the same stream's shard simultaneously, so you must not deploy more than two Auth Service instances that read from a DynamoDB backend. For details on DynamoDB Streams, read the AWS documentation.
teleport:
  storage:
    type: dynamodb
    # Region location of dynamodb instance, https://docs.aws.amazon.com/en_pv/general/latest/gr/rande.html#ddb_region
    region: us-east-1
    # Name of the DynamoDB table. If it does not exist, Teleport will create it.
    table_name: Example_TELEPORT_DYNAMO_TABLE_NAME
    # This setting configures Teleport to send the audit events to three places:
    # To keep a copy in DynamoDB, a copy on a local filesystem, and also output the events to stdout.
    # NOTE: The DynamoDB events table has a different schema to the regular Teleport
    # database table, so attempting to use the same table for both will result in errors.
    # When using highly available storage like DynamoDB, you should make sure that the list always specifies
    # the High Availability storage method first, as this is what the Teleport web UI uses as its source of events to display.
    audit_events_uri: ['dynamodb://events_table_name', 'file:///var/lib/teleport/audit/events', 'stdout://']
    # This setting configures Teleport to save the recorded sessions in an S3 bucket:
    audit_sessions_uri: s3://Example_TELEPORT_S3_BUCKET/records
    # By default, Teleport stores audit events with an AWS TTL of 1 year.
    # This value can be configured as shown below. If set to 0 seconds, TTL is disabled.
    retention_period: 365d
    # Enables either Pay Per Request or Provisioned billing for the DynamoDB table. Set when Teleport creates the table.
    # Possible values: "pay_per_request" and "provisioned"
    # default: "pay_per_request"
    billing_mode: "pay_per_request"
    # continuous_backups is used to optionally enable continuous backups.
    # default: false
    continuous_backups: true
- Replace us-east-1 and Example_TELEPORT_DYNAMO_TABLE_NAME with your own settings. Teleport will create the table automatically.
- Example_TELEPORT_DYNAMO_TABLE_NAME and events_table_name must be different DynamoDB tables. The schema is different for each. Using the same table name for both will result in errors.
- Audit log settings above are optional. If specified, Teleport will store the audit log in DynamoDB and the session recordings must be stored in an S3 bucket, i.e. both audit_xxx settings must be present. If they are not set, Teleport will default to a local file system for the audit log, i.e. /var/lib/teleport/log on an Auth Service instance.
The optional query parameters shown below control how Teleport interacts with a DynamoDB endpoint.
dynamodb://events_table_name?region=us-east-1&endpoint=dynamo.example.com&use_fips_endpoint=true
- region=us-east-1 - set the Amazon region to use.
- endpoint=dynamo.example.com - connect to a custom DynamoDB endpoint.
- use_fips_endpoint=true - configure DynamoDB FIPS endpoints.
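As an illustration of how these parameters combine, the sketch below assembles a dynamodb:// URI from optional settings. The function name dynamodb_events_uri is hypothetical; only the three parameter names come from the text above.

```python
from urllib.parse import urlencode


def dynamodb_events_uri(table, region=None, endpoint=None, use_fips_endpoint=None):
    """Build a dynamodb:// audit events URI, adding only the parameters that are set."""
    params = {}
    if region:
        params["region"] = region
    if endpoint:
        params["endpoint"] = endpoint
    if use_fips_endpoint is not None:
        params["use_fips_endpoint"] = "true" if use_fips_endpoint else "false"
    query = urlencode(params)
    return f"dynamodb://{table}" + (f"?{query}" if query else "")


print(dynamodb_events_uri(
    "events_table_name",
    region="us-east-1",
    endpoint="dynamo.example.com",
    use_fips_endpoint=True,
))
```

Leaving use_fips_endpoint as None omits the parameter entirely, which matters because an unset value and an explicit false are treated differently.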
DynamoDB Continuous Backups
When setting up DynamoDB it's important to enable backups so that cluster state can be restored if needed from a snapshot in the past.
DynamoDB On-Demand
For best performance it is recommended to use On-Demand mode instead of configuring capacity manually via Provisioned mode. This helps prevent any DynamoDB throttling due to underestimated usage or increased usage from impacting Teleport.
Configuring AWS FIPS endpoints
This config option applies to Amazon S3 and Amazon DynamoDB.
Set use_fips_endpoint to true or false. If true, FIPS DynamoDB endpoints will be used. If false, normal DynamoDB endpoints will be used. If unset, the AWS environment variable AWS_USE_FIPS_ENDPOINT determines which endpoint is used.
FIPS endpoints will also be used if Teleport is run with the --fips flag.
Config option priority is applied in the following order:
- Setting the use_fips_endpoint query parameter as shown above
- Using the --fips flag when running Teleport
- Using the AWS_USE_FIPS_ENDPOINT environment variable
Setting this environment variable to true will enable FIPS endpoints for all AWS resource types. Some FIPS endpoints are not supported in certain regions or environments or are only supported in GovCloud.
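The priority order can be read as a small decision function. The sketch below models the documented order only; it is not Teleport's implementation, and the function and argument names are hypothetical. It assumes the environment variable counts as enabled only when its value is the string true.

```python
def fips_enabled(query_param=None, fips_flag=False, env=None):
    """Resolve whether FIPS endpoints are used, following the documented priority.

    query_param: value of use_fips_endpoint in the URI ("true", "false", or None if unset).
    fips_flag:   True if Teleport was started with --fips.
    env:         mapping of environment variables.
    """
    env = env if env is not None else {}
    # 1. An explicit query parameter wins, whether true or false.
    if query_param is not None:
        return query_param.lower() == "true"
    # 2. Otherwise the --fips flag enables FIPS endpoints.
    if fips_flag:
        return True
    # 3. Otherwise fall back to the AWS environment variable.
    return env.get("AWS_USE_FIPS_ENDPOINT", "").lower() == "true"


print(fips_enabled(query_param="false", fips_flag=True))
print(fips_enabled(env={"AWS_USE_FIPS_ENDPOINT": "true"}))
```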
Athena
The Athena audit log backend is available starting from Teleport v14.0.
If you are running Teleport on AWS, you can use an Athena-based audit log system that manages Parquet files stored on S3 as a storage backend to achieve high availability. The Athena backend supports only one type of Teleport data, audit events.
The Athena audit backend scales better and handles searches better than DynamoDB.
The Athena audit logs are eventually consistent. It may take up to one minute (depending on the batchMaxInterval setting and event load) until you can view events in the Teleport Web UI.
Infrastructure setup
The Auth Service uses an SQS queue subscribed to an SNS topic for event publishing. A single Auth Service instance reads events in batches from SQS, converts them into Parquet format, and sends the resulting data to S3. During queries, the Athena engine searches for events on S3, reading metadata from a Glue table.
You can set up the required infrastructure to support the Athena backend with the following Terraform script:
Terraform script
variable "aws_region" {
description = "AWS region"
default = "us-west-2"
}
variable "sns_topic_name" {
description = "Name of the SNS topic used for publishing audit events"
}
variable "sqs_queue_name" {
description = "Name of the SQS queue used for subscription for audit events topic"
}
variable "sqs_dlq_name" {
description = "Name of the SQS Dead-Letter Queue used for handling unprocessable events"
}
variable "max_receive_count" {
description = "Number of times a message can be received before it is sent to the DLQ"
default = 10
}
variable "kms_key_alias" {
description = "The alias of a custom KMS key"
}
variable "long_term_bucket_name" {
description = "Name of the long term storage bucket used for storing audit events"
}
variable "transient_bucket_name" {
description = "Name of the transient storage bucket used for storing query results and large events payloads"
}
variable "database_name" {
description = "Name of Glue database"
}
variable "table_name" {
description = "Name of Glue table"
}
variable "workgroup" {
description = "Name of Athena workgroup"
}
variable "workgroup_max_scanned_bytes_per_query" {
description = "Limit per query of max scanned bytes"
default = 1073741824 # 1GB
}
# The search_event_limiter variables configure a rate limit on top of the
# search events API to prevent increased costs from aggressive use of the API.
# The current version of the Athena audit logger is not designed for API polling.
# With burst=20, time=1m, and amount=5, you can make 20 requests without any
# throttling. Further requests are throttled, and 5 tokens are added to the
# rate limit bucket every minute.
variable "search_event_limiter_burst" {
description = "Number of tokens available for rate limit used on top of search event API"
default = 20
}
variable "search_event_limiter_time" {
description = "Duration between the addition of tokens to the bucket for rate limit used on top of search event API"
default = "1m"
}
variable "search_event_limiter_amount" {
description = "Number of tokens added to the bucket during specific interval for rate limit used on top of search event API"
default = 5
}
variable "access_monitoring_trusted_relationship_role_arn" {
description = "AWS Role ARN that will be used to configure trusted relationship between provided role and Access Monitoring role allowing to assume Access Monitoring role by the provided role"
default = ""
}
variable "access_monitoring" {
description = "Enabled Access Monitoring"
type = bool
default = false
}
variable "access_monitoring_prefix" {
description = "Prefix for resources created by Access Monitoring"
default = ""
}
provider "aws" {
region = var.aws_region
}
data "aws_caller_identity" "current" {}
resource "aws_kms_key" "audit_key" {
description = "KMS key for Athena audit log"
enable_key_rotation = true
}
resource "aws_kms_key_policy" "audit_key_policy" {
key_id = aws_kms_key.audit_key.id
policy = jsonencode({
Statement = [
{
Action = [
"kms:*"
]
Effect = "Allow"
Principal = {
AWS = data.aws_caller_identity.current.account_id
}
Resource = "*"
Sid = "Default Policy"
},
{
Action = [
"kms:GenerateDataKey",
"kms:Decrypt"
]
Effect = "Allow"
Principal = {
Service = "sns.amazonaws.com"
}
Resource = "*"
Sid = "SnsUsage"
Condition = {
StringEquals = {
"aws:SourceAccount" = data.aws_caller_identity.current.account_id
}
ArnLike = {
"aws:SourceArn" : aws_sns_topic.audit_topic.arn
}
}
},
]
Version = "2012-10-17"
})
}
resource "aws_kms_alias" "audit_key_alias" {
name = "alias/${var.kms_key_alias}"
target_key_id = aws_kms_key.audit_key.key_id
}
resource "aws_sns_topic" "audit_topic" {
name = var.sns_topic_name
kms_master_key_id = aws_kms_key.audit_key.arn
}
resource "aws_sqs_queue" "audit_queue_dlq" {
name = var.sqs_dlq_name
kms_master_key_id = aws_kms_key.audit_key.arn
kms_data_key_reuse_period_seconds = 300
message_retention_seconds = 604800 // 7 days which is three days longer than default 4 of sqs queue
}
resource "aws_sqs_queue" "audit_queue" {
name = var.sqs_queue_name
kms_master_key_id = aws_kms_key.audit_key.arn
kms_data_key_reuse_period_seconds = 300
redrive_policy = jsonencode({
deadLetterTargetArn = aws_sqs_queue.audit_queue_dlq.arn
maxReceiveCount = var.max_receive_count
})
}
resource "aws_sns_topic_subscription" "audit_sqs_target" {
topic_arn = aws_sns_topic.audit_topic.arn
protocol = "sqs"
endpoint = aws_sqs_queue.audit_queue.arn
raw_message_delivery = true
}
data "aws_iam_policy_document" "audit_policy" {
statement {
actions = [
"SQS:SendMessage",
]
effect = "Allow"
principals {
type = "Service"
identifiers = ["sns.amazonaws.com"]
}
resources = [aws_sqs_queue.audit_queue.arn]
condition {
test = "ArnEquals"
variable = "aws:SourceArn"
values = [aws_sns_topic.audit_topic.arn]
}
}
}
resource "aws_sqs_queue_policy" "audit_policy" {
queue_url = aws_sqs_queue.audit_queue.url
policy = data.aws_iam_policy_document.audit_policy.json
}
resource "aws_s3_bucket" "long_term_storage" {
bucket = var.long_term_bucket_name
force_destroy = true
# On production we recommend enabling object lock to provide deletion protection.
object_lock_enabled = false
}
resource "aws_s3_bucket_server_side_encryption_configuration" "long_term_storage" {
bucket = aws_s3_bucket.long_term_storage.id
rule {
apply_server_side_encryption_by_default {
kms_master_key_id = aws_kms_key.audit_key.arn
sse_algorithm = "aws:kms"
}
bucket_key_enabled = true
}
}
resource "aws_s3_bucket_ownership_controls" "long_term_storage" {
bucket = aws_s3_bucket.long_term_storage.id
rule {
object_ownership = "BucketOwnerEnforced"
}
}
resource "aws_s3_bucket_versioning" "long_term_storage" {
bucket = aws_s3_bucket.long_term_storage.id
versioning_configuration {
status = "Enabled"
}
}
resource "aws_s3_bucket_public_access_block" "long_term_storage" {
bucket = aws_s3_bucket.long_term_storage.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
resource "aws_s3_bucket" "transient_storage" {
bucket = var.transient_bucket_name
force_destroy = true
# On production we recommend enabling lifecycle configuration to clean transient data.
}
resource "aws_s3_bucket_server_side_encryption_configuration" "transient_storage" {
bucket = aws_s3_bucket.transient_storage.id
rule {
apply_server_side_encryption_by_default {
kms_master_key_id = aws_kms_key.audit_key.arn
sse_algorithm = "aws:kms"
}
bucket_key_enabled = true
}
}
resource "aws_s3_bucket_ownership_controls" "transient_storage" {
bucket = aws_s3_bucket.transient_storage.id
rule {
object_ownership = "BucketOwnerEnforced"
}
}
resource "aws_s3_bucket_versioning" "transient_storage" {
bucket = aws_s3_bucket.transient_storage.id
versioning_configuration {
status = "Enabled"
}
}
resource "aws_s3_bucket_public_access_block" "transient_storage" {
bucket = aws_s3_bucket.transient_storage.id
block_public_acls = true
block_public_policy = true
ignore_public_acls = true
restrict_public_buckets = true
}
resource "aws_glue_catalog_database" "audit_db" {
name = var.database_name
}
resource "aws_glue_catalog_table" "audit_table" {
name = var.table_name
database_name = aws_glue_catalog_database.audit_db.name
table_type = "EXTERNAL_TABLE"
parameters = {
"EXTERNAL" = "TRUE",
"projection.enabled" = "true",
"projection.event_date.type" = "date",
"projection.event_date.format" = "yyyy-MM-dd",
"projection.event_date.interval" = "1",
"projection.event_date.interval.unit" = "DAYS",
"projection.event_date.range" = "NOW-4YEARS,NOW",
"storage.location.template" = format("s3://%s/events/$${event_date}/", aws_s3_bucket.long_term_storage.bucket)
"classification" = "parquet"
"parquet.compression" = "SNAPPY",
}
storage_descriptor {
location = format("s3://%s", aws_s3_bucket.long_term_storage.bucket)
input_format = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"
output_format = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"
ser_de_info {
name = "example"
parameters = { "serialization.format" = "1" }
serialization_library = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
}
columns {
name = "uid"
type = "string"
}
columns {
name = "session_id"
type = "string"
}
columns {
name = "event_type"
type = "string"
}
columns {
name = "event_time"
type = "timestamp"
}
columns {
name = "event_data"
type = "string"
}
columns {
name = "user"
type = "string"
}
}
partition_keys {
name = "event_date"
type = "date"
}
}
resource "aws_athena_workgroup" "workgroup" {
name = var.workgroup
force_destroy = true
configuration {
bytes_scanned_cutoff_per_query = var.workgroup_max_scanned_bytes_per_query
engine_version {
selected_engine_version = "Athena engine version 3"
}
result_configuration {
output_location = format("s3://%s/results", aws_s3_bucket.transient_storage.bucket)
encryption_configuration {
encryption_option = "SSE_KMS"
kms_key_arn = aws_kms_key.audit_key.arn
}
}
}
}
output "athena_url" {
value = format("athena://%s.%s?%s",
aws_glue_catalog_database.audit_db.name,
aws_glue_catalog_table.audit_table.name,
join("&", [
format("topicArn=%s", aws_sns_topic.audit_topic.arn),
format("largeEventsS3=s3://%s/large_payloads", aws_s3_bucket.transient_storage.bucket),
format("locationS3=s3://%s/events", aws_s3_bucket.long_term_storage.bucket),
format("workgroup=%s", aws_athena_workgroup.workgroup.name),
format("queueURL=%s", aws_sqs_queue.audit_queue.url),
format("queryResultsS3=s3://%s/query_results", aws_s3_bucket.transient_storage.bucket),
format("limiterBurst=%d", var.search_event_limiter_burst),
format("limiterRefillAmount=%s", var.search_event_limiter_amount),
format("limiterRefillTime=%s", var.search_event_limiter_time),
])
)
}
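The search_event_limiter_* variables in the script above describe a token bucket. The sketch below simulates that behaviour with the default values (burst 20, 5 tokens every minute) to show how many search requests would pass. It is an illustrative model with hypothetical names, not the limiter Teleport ships, and it assumes the bucket never holds more than the burst size.

```python
class TokenBucket:
    """Token bucket that starts full and refills in fixed steps."""

    def __init__(self, burst: int, refill_amount: int, refill_seconds: float):
        self.burst = burst
        self.refill_amount = refill_amount
        self.refill_seconds = refill_seconds
        self.tokens = burst
        self.last_refill = 0.0

    def allow(self, now: float) -> bool:
        """Return True if a request at time `now` (in seconds) is allowed."""
        intervals = int((now - self.last_refill) // self.refill_seconds)
        if intervals > 0:
            # Assumption: tokens are capped at the burst size.
            self.tokens = min(self.burst, self.tokens + intervals * self.refill_amount)
            self.last_refill += intervals * self.refill_seconds
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(burst=20, refill_amount=5, refill_seconds=60)
first_wave = sum(bucket.allow(now=0) for _ in range(25))
second_wave = sum(bucket.allow(now=60) for _ in range(10))
print(first_wave, second_wave)
```

With these defaults, a client that polls the search API quickly exhausts the burst and is then limited to five requests per minute.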
Configuring the Athena audit log backend
To configure Teleport to use Athena:
- Make sure you are using Teleport version 14.0.0 or newer.
- Prepare the infrastructure as described in the previous section.
- Specify an Athena URL inside the audit_events_uri array in your Teleport configuration file:
teleport:
  storage:
    # This setting configures Teleport to keep a copy of the audit log in Athena
    # and a copy on a local filesystem, and also to output the events to stdout.
    audit_events_uri:
      # More details about the full Athena URL are shown below.
      - 'athena://database.table?params'
      - 'file:///var/lib/teleport/audit/events'
      - 'stdout://'
Here is an example of an Amazon Athena URL within the audit_events_uri configuration field:
athena://db.table?topicArn=arn:aws:sns:region:account_id:topic_name&largeEventsS3=s3://transient/large_payloads&locationS3=s3://long-term/events&workgroup=workgroup&queueURL=https://sqs.region.amazonaws.com/account_id/queue_name&queryResultsS3=s3://transient/query_results
The URL hostname consists of database.table, which points to the Glue database and table that the Athena audit logger will use.
Other parameters are specified as query parameters within the Athena URL.
The following parameters are required:
| Parameter name | Example value | Description |
| topicArn | arn:aws:sns:region:account_id:topic_name | ARN of SNS topic where events are published |
| locationS3 | s3://long-term/events | S3 bucket used for long-term storage |
| largeEventsS3 | s3://transient/large_payloads | S3 bucket used for transient storage for large events |
| queueURL | https://sqs.region.amazonaws.com/account_id/queue_name | SQS URL used for a subscription to an SNS topic |
| workgroup | workgroup_name | Athena workgroup used for queries |
| queryResultsS3 | s3://transient/results | S3 bucket used for transient storage for query results |
The following parameters are optional:
| Parameter name | Example value | Description |
| region | us-east-1 | AWS region. If empty, defaults to one from the AuditConfig or ambient AWS credentials |
| batchMaxItems | 20000 | Defines the maximum number of events allowed for a single Parquet file (default 20000) |
| batchMaxInterval | 1m | Defines the maximum interval used to buffer incoming data before creating a Parquet file (default 1m) |
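Because the Athena URL carries many parameters, a quick pre-flight check can catch omissions before the Auth Service is restarted. The sketch below is a hypothetical validator, not part of Teleport: it splits the host into database and table and reports which of the required parameters listed above are absent.

```python
from urllib.parse import parse_qs, urlsplit

REQUIRED_PARAMS = (
    "topicArn",
    "locationS3",
    "largeEventsS3",
    "queueURL",
    "workgroup",
    "queryResultsS3",
)


def check_athena_url(url: str):
    """Return (database, table, missing_required_params) for an athena:// URL."""
    parts = urlsplit(url)
    if parts.scheme != "athena":
        raise ValueError("expected an athena:// URL")
    database, sep, table = parts.netloc.partition(".")
    if not (database and sep and table):
        raise ValueError("expected database.table as the URL host")
    params = parse_qs(parts.query)
    missing = [name for name in REQUIRED_PARAMS if name not in params]
    return database, table, missing


example = (
    "athena://db.table?topicArn=arn:aws:sns:region:account_id:topic_name"
    "&largeEventsS3=s3://transient/large_payloads"
    "&locationS3=s3://long-term/events"
    "&workgroup=workgroup"
    "&queueURL=https://sqs.region.amazonaws.com/account_id/queue_name"
    "&queryResultsS3=s3://transient/query_results"
)
print(check_athena_url(example))
```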
Authenticating to AWS
The Teleport Auth Service must be able to read AWS credentials in order to authenticate to Athena.
Grant the Teleport Auth Service access to credentials that it can use to authenticate to AWS.
- If you are running the Teleport Auth Service on an EC2 instance, you may use the EC2 Instance Metadata Service method
- If you are running the Teleport Auth Service in Kubernetes, you can use IAM Roles for Service Accounts (IRSA)
- Otherwise, you must use environment variables
Instance Metadata Service
Teleport will detect when it is running on an EC2 instance and use the Instance Metadata Service to fetch credentials.
The EC2 instance should be configured to use an EC2 instance profile. For more information, see: Using Instance Profiles.
Kubernetes IRSA
Refer to IAM Roles for Service Accounts (IRSA) to set up an OIDC provider in AWS and configure an AWS IAM role that allows the pod's service account to assume the role.
Environment Variables
Teleport's built-in AWS client reads credentials from the following environment variables:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_DEFAULT_REGION
When you start the Teleport Auth Service, the service reads environment variables from a file at the path /etc/default/teleport. Obtain these credentials from your organization. Ensure that /etc/default/teleport has the following content, replacing the values of each variable:
AWS_ACCESS_KEY_ID=00000000000000000000
AWS_SECRET_ACCESS_KEY=0000000000000000000000000000000000000000
AWS_DEFAULT_REGION=<YOUR_REGION>
Have multiple sources of AWS credentials?
Teleport's AWS client loads credentials from different sources in the following order:
- Environment Variables
- Shared credentials file
- Shared configuration file (Teleport always enables shared configuration)
- EC2 Instance Metadata (credentials only)
While you can provide AWS credentials via a shared credentials file or shared configuration file, you will need to run the Teleport Auth Service with the AWS_PROFILE environment variable assigned to the name of your profile of choice.
If you have a specific use case that the instructions above do not account for, consult the documentation for the AWS SDK for Go for a detailed description of credential loading behavior.
The IAM role that the Teleport Auth Service authenticates as must have the policies specified in the next section.
IAM policies
Make sure that the IAM role assigned to Teleport is configured with sufficient access to Athena. Below you can find the IAM permissions that the Auth Service requires to use Athena Audit logs as an audit event backend.
You'll need to replace these values in the policy example below:
| Placeholder value | Replace with |
| eu-central-1 | AWS region |
| 1234567890 | AWS account ID |
| audit-long-term | S3 bucket used for long-term storage |
| audit-transient | S3 bucket used for transient storage |
| audit-sqs | SQS queue name |
| audit-sns | SNS topic name |
| kms_id | KMS key ID used for server-side encryption of SNS/SQS/S3 |
| audit_db | Glue database used for audit logs |
| audit_table | Glue table used for audit logs |
| audit_workgroup | Athena workgroup used for audit logs |
{
"Version": "2012-10-17",
"Statement": [
{
"Action": [
"s3:ListBucketMultipartUploads",
"s3:GetBucketLocation",
"s3:ListBucketVersions",
"s3:ListBucket"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::audit-transient",
"arn:aws:s3:::audit-long-term"
],
"Sid": "AllowListingMultipartUploads"
},
{
"Action": [
"s3:PutObject",
"s3:ListMultipartUploadParts",
"s3:GetObjectVersion",
"s3:GetObject",
"s3:DeleteObjectVersion",
"s3:DeleteObject",
"s3:AbortMultipartUpload"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::audit-transient/results/*",
"arn:aws:s3:::audit-transient/large_payloads/*",
"arn:aws:s3:::audit-long-term/events/*"
],
"Sid": "AllowMultipartAndObjectAccess"
},
{
"Action": "sns:Publish",
"Effect": "Allow",
"Resource": "arn:aws:sns:eu-central-1:1234567890:audit-sns",
"Sid": "AllowPublishSNS"
},
{
"Action": [
"sqs:ReceiveMessage",
"sqs:DeleteMessage"
],
"Effect": "Allow",
"Resource": "arn:aws:sqs:eu-central-1:1234567890:audit-sqs",
"Sid": "AllowReceiveSQS"
},
{
"Action": [
"glue:GetTable",
"athena:StartQueryExecution",
"athena:GetQueryResults",
"athena:GetQueryExecution"
],
"Effect": "Allow",
"Resource": [
"arn:aws:glue:eu-central-1:1234567890:table/audit_db/audit_table",
"arn:aws:glue:eu-central-1:1234567890:database/audit_db",
"arn:aws:glue:eu-central-1:1234567890:catalog",
"arn:aws:athena:eu-central-1:1234567890:workgroup/audit_workgroup"
],
"Sid": "AllowAthenaQuery"
},
{
"Action": [
"kms:GenerateDataKey",
"kms:Decrypt"
],
"Effect": "Allow",
"Resource": "arn:aws:kms:eu-central-1:1234567890:key/kms_id",
"Sid": "AllowAthenaKMSUsage"
}
]
}
Migration from Dynamo to the Athena audit logs backend
Migration is only needed if you used Amazon DynamoDB for audit logs and you want to keep old data.
Migration consists of the following steps:
- Set up Athena infrastructure
- Dual write to both DynamoDB and Athena, and query from DynamoDB
- Migrate old data from DynamoDB to Athena
- Dual write to both DynamoDB and Athena, and query from Athena
- Disable writing to DynamoDB
In the Teleport storage configuration, audit_events_uri accepts multiple URLs. These URLs configure connections to the different audit loggers. If more than one URL is specified, events are written to each audit system, and queries are executed against the first one.
If anything goes wrong during migration steps 1-4, roll back to the Amazon DynamoDB solution by making sure its URL is the first value in the audit_events_uri field and removing the Athena URL.
Each of these steps is explained in more detail below.
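Since the order of audit_events_uri decides which backend serves queries, the migration can be summarised as a sequence of list orderings. The sketch below maps each step number from the list above to the value audit_events_uri would hold during that step. The function name is hypothetical, and the values for steps 1 and 3 are an assumption that the configuration stays as it was in the preceding state.

```python
def audit_events_uri_for_step(step: int, dynamodb_uri: str, athena_uri: str) -> list:
    """Return the audit_events_uri list for a migration step (1-5).

    Events are written to every URI in the list; queries use the first one.
    """
    if step == 1:
        # Setting up Athena infrastructure: configuration unchanged (assumption).
        return [dynamodb_uri]
    if step in (2, 3):
        # Dual write, query from DynamoDB. Step 3 migrates old data with the same config.
        return [dynamodb_uri, athena_uri]
    if step == 4:
        # Dual write, query from Athena.
        return [athena_uri, dynamodb_uri]
    if step == 5:
        # Writing to DynamoDB disabled.
        return [athena_uri]
    raise ValueError("step must be between 1 and 5")


for step in range(1, 6):
    uris = audit_events_uri_for_step(
        step, "dynamodb://events_table_name", "athena://db.table?otherQueryParams"
    )
    print(step, "queries from:", uris[0], "writes to:", uris)
```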
Dual write to both DynamoDB and Athena, and query from DynamoDB
The second step of migration requires setting the following configuration:
teleport:
  storage:
    audit_events_uri:
      - 'dynamodb://events_table_name'
      - 'athena://db.table?otherQueryParams'
When an Auth Service instance is restarted, you should verify that Parquet files are stored in the S3 bucket specified using the locationS3 parameter.
Migrate old data from DynamoDB to Athena
This step requires using a client machine to export data from Amazon DynamoDB and publish it to the Athena logger. We recommend using, for example, an EC2 instance with a disk at least twice the size of the table in Amazon DynamoDB.
Instructions for how to use the migration tool can be found on GitHub.
You should set exportTime to the time when dual writing began.
We recommend running your first migration with the -dry-run flag because it validates the exported data. If no errors are reported, proceed to a real migration without the -dry-run flag.
Dual write to both DynamoDB and Athena, and query from Athena
Change the order of the audit_events_uri values in your Teleport configuration file:
teleport:
  storage:
    audit_events_uri:
      - 'athena://db.table?otherQueryParams'
      - 'dynamodb://events_table_name'
When the Auth Service is restarted, you should verify that events are visible on the Audit Logs page.
Disable writing to DynamoDB
Disabling writing to DynamoDB means that you won't be able to roll back to DynamoDB without losing data. Dual writing to both Athena and DynamoDB does not have a significant performance impact, and it's recommended to keep dual writing for some time, even if your system already executes queries from Athena.
To disable writing to DynamoDB, remove the DynamoDB URL from the audit_events_uri array.
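The rollback rule for this migration (DynamoDB URL first, Athena URL removed) can be expressed as a small helper. This is a sketch under stated assumptions: `rollback_audit_events_uri` is a hypothetical function, not a Teleport tool, and it only rewrites the list of URIs. It refuses to produce a rollback value once DynamoDB is no longer being written to, since rolling back at that point would lose data.

```python
def rollback_audit_events_uri(audit_events_uri):
    """Return the rollback value: DynamoDB URL(s) first, Athena URL(s) removed."""
    dynamodb = [u for u in audit_events_uri if u.startswith("dynamodb://")]
    if not dynamodb:
        # Writing to DynamoDB was disabled, so newer events exist only in Athena.
        raise ValueError("no DynamoDB URL configured: rollback would lose data")
    # Keep any unrelated loggers (for example file:// or stdout://) after DynamoDB.
    others = [
        u for u in audit_events_uri
        if not u.startswith(("dynamodb://", "athena://"))
    ]
    return dynamodb + others


# Configuration while dual writing and querying from Athena.
dual_write_query_athena = [
    "athena://db.table?otherQueryParams",
    "dynamodb://events_table_name",
]
print(rollback_audit_events_uri(dual_write_query_athena))
```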
GCS
Google Cloud Storage (GCS) can be used as storage for recorded sessions. GCS cannot store the audit log or the cluster state. Below is an example of how to configure a Teleport Auth Service to store the recorded sessions in a GCS bucket.
teleport:
  storage:
    # Path to GCS to store the recorded sessions in.
    audit_sessions_uri: 'gs://$BUCKET_NAME/records?projectID=$PROJECT_ID&credentialsPath=$CREDENTIALS_PATH'
We recommend creating a bucket in Dual-Region mode with the Standard storage class to ensure cluster performance and high availability.
Replace the following variables in the above example with your own values:
- $BUCKET_NAME with the name of the desired GCS bucket. If the bucket does not exist, it will be created. Please ensure the following permissions are granted for the given bucket:
  - storage.buckets.get
  - storage.objects.create
  - storage.objects.get
  - storage.objects.list
  - storage.objects.update
  - storage.objects.delete
  storage.objects.delete is required in order to clean up multipart files after they have been assembled into the final blob.
  If the bucket does not exist, please also ensure that the storage.buckets.create permission is granted.
- $PROJECT_ID with a GCS-enabled GCP project.
- $CREDENTIALS_PATH with the path to a JSON-formatted GCP credentials file configured for a service account applicable to the project.
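As an illustration of how the three variables combine, the following Python sketch assembles the audit_sessions_uri value and lists the permissions named above. The helper names (`gcs_sessions_uri`, `required_permissions`) and the sample values are hypothetical; only the URI layout and the permission names come from this guide.

```python
from urllib.parse import urlencode

# Permissions listed above for an existing bucket.
BUCKET_PERMISSIONS = [
    "storage.buckets.get",
    "storage.objects.create",
    "storage.objects.get",
    "storage.objects.list",
    "storage.objects.update",
    "storage.objects.delete",  # cleans up multipart files after assembly
]


def required_permissions(bucket_exists):
    """Return the permissions needed, adding bucket creation when required."""
    permissions = list(BUCKET_PERMISSIONS)
    if not bucket_exists:
        permissions.append("storage.buckets.create")
    return permissions


def gcs_sessions_uri(bucket_name, project_id, credentials_path):
    """Build the gs:// URI in the same layout as the example configuration."""
    query = urlencode(
        {"projectID": project_id, "credentialsPath": credentials_path},
        safe="/",  # keep the credentials path readable
    )
    return f"gs://{bucket_name}/records?{query}"


# Sample values are placeholders, not real resources.
uri = gcs_sessions_uri("my-bucket", "my-project", "/etc/teleport/gcs.json")
print(uri)
```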
Firestore
If you are running Teleport on GCP, you can use Firestore as a storage backend to achieve high availability. The Firestore backend supports two types of Teleport data:
- Cluster state
- Audit log events
Firestore cannot store the recorded sessions. You are advised to use Google Cloud Storage (GCS) for that as shown above. To configure Teleport to use Firestore:
- Configure all Teleport Auth servers to use the Firestore backend in the "storage" section of teleport.yaml as shown below.
- Deploy several auth servers connected to the Firestore storage backend.
- Deploy several proxy nodes.
- Make sure that all Teleport resource services have the auth_servers configuration setting populated with the addresses of your cluster's Auth Service instances, or use a load balancer for Auth Service instances in high availability mode.
teleport:
  storage:
    type: firestore
    # Project ID https://support.google.com/googleapi/answer/7014113?hl=en
    project_id: Example_GCP_Project_Name
    # Name of the Firestore table.
    collection_name: Example_TELEPORT_FIRESTORE_TABLE_NAME
    # An optional database id to use. If not provided the default
    # database for the project is used.
    database_id: Example_TELEPORT_FIRESTORE_DATABASE_ID
    credentials_path: /var/lib/teleport/gcs_creds
    # This setting configures Teleport to send the audit events to three places:
    # To keep a copy in Firestore, a copy on a local filesystem, and also write the events to stdout.
    # NOTE: The Firestore events table has a different schema to the regular Teleport
    # database table, so attempting to use the same table for both will result in errors.
    # When using highly available storage like Firestore, you should make sure that the list always specifies
    # the High Availability storage method first, as this is what the Teleport web UI uses as its source of events to display.
    audit_events_uri: ['firestore://Example_TELEPORT_FIRESTORE_EVENTS_TABLE_NAME?projectID=$PROJECT_ID&credentialsPath=$CREDENTIALS_PATH&databaseID=$DATABASE_ID', 'file:///var/lib/teleport/audit/events', 'stdout://']
    # This setting configures Teleport to save the recorded sessions in GCP storage:
    audit_sessions_uri: gs://Example_TELEPORT_GCS_BUCKET/records
- Replace Example_GCP_Project_Name and Example_TELEPORT_FIRESTORE_TABLE_NAME with your own settings. Teleport will create the table automatically.
- Example_TELEPORT_FIRESTORE_TABLE_NAME and Example_TELEPORT_FIRESTORE_EVENTS_TABLE_NAME must be different Firestore tables. The schema is different for each. Using the same table name for both will result in errors.
- The GCP authentication setting above can be omitted if the machine itself is running on a GCE instance with a Service Account that has access to the Firestore table.
- Audit log settings above are optional. If specified, Teleport will store the audit log in Firestore and the session recordings must be stored in a GCS bucket, i.e. both audit_xxx settings must be present. If they are not set, Teleport will default to a local filesystem for the audit log, i.e. /var/lib/teleport/log on an Auth Service instance.
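The requirement that the cluster state table and the events table differ is easy to check before deploying. The sketch below is a hypothetical pre-flight check, not something Teleport ships: `check_firestore_tables` takes the storage section as a plain dictionary (for example, after loading teleport.yaml with a YAML parser of your choice) and compares collection_name with the table named in any firestore:// audit URI.

```python
from urllib.parse import urlsplit


def firestore_events_table(uri):
    """Return the table named in a firestore:// URI, or None for other schemes."""
    parts = urlsplit(uri)
    if parts.scheme != "firestore":
        return None
    return parts.netloc


def check_firestore_tables(storage):
    """Raise ValueError if the state table is reused as the events table."""
    state_table = storage["collection_name"]
    for uri in storage.get("audit_events_uri", []):
        if firestore_events_table(uri) == state_table:
            raise ValueError(
                f"{state_table!r} is used for both cluster state and audit events"
            )


# Mirrors the example configuration; the names are placeholders.
storage = {
    "type": "firestore",
    "collection_name": "Example_TELEPORT_FIRESTORE_TABLE_NAME",
    "audit_events_uri": [
        "firestore://Example_TELEPORT_FIRESTORE_EVENTS_TABLE_NAME?projectID=p",
        "file:///var/lib/teleport/audit/events",
        "stdout://",
    ],
}
check_firestore_tables(storage)  # passes silently: the two tables differ
```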
Azure Blob Storage
Azure Blob Storage for session storage is available starting from Teleport 13.3.
Azure Blob Storage can be used as storage for recorded sessions. Azure Blob Storage cannot store the audit log or the cluster state. Below is an example of how to configure a Teleport Auth Service instance to store the recorded sessions in an Azure Blob Storage storage account.
teleport:
  storage:
    audit_sessions_uri: azblob://account-name.blob.core.windows.net
Teleport makes use of two containers in the account, whose names default to inprogress and session, but they can be configured with parameters in the fragment of the URI.
teleport:
  storage:
    audit_sessions_uri: azblob://account-name.blob.core.windows.net#session_container=session_container_name&inprogress_container=inprogress_container_name
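To make the URI layout concrete, the following Python sketch assembles the azblob:// value from an account name and optional container names. `azblob_sessions_uri` is a hypothetical helper written for this illustration; the host suffix and the two fragment parameter names are taken from the examples above.

```python
from urllib.parse import urlencode


def azblob_sessions_uri(account_name, session_container=None,
                        inprogress_container=None):
    """Build an azblob:// URI; container overrides go in the URI fragment."""
    uri = f"azblob://{account_name}.blob.core.windows.net"
    params = {}
    if session_container:
        params["session_container"] = session_container
    if inprogress_container:
        params["inprogress_container"] = inprogress_container
    if params:
        uri += "#" + urlencode(params)
    return uri


# With no overrides, the default container names apply.
print(azblob_sessions_uri("account-name"))
# With both container names overridden.
print(azblob_sessions_uri(
    "account-name",
    session_container="session_container_name",
    inprogress_container="inprogress_container_name",
))
```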
Permissions
Teleport needs the following permissions on both containers:
- Microsoft.Storage/storageAccounts/blobServices/containers/blobs/read
- Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write
- Microsoft.Storage/storageAccounts/blobServices/containers/blobs/delete (only on the inprogress container)
In addition, Teleport will check if the containers exist at startup, and it will attempt to create them if they can't be confirmed to exist. Giving Teleport Microsoft.Storage/storageAccounts/blobServices/containers/read will allow for checking, and Microsoft.Storage/storageAccounts/blobServices/containers/write will allow for creating them.
It's highly recommended to set up a time-based retention policy for the session container, as well as a lifecycle management policy, so that recordings are kept in an immutable state for a given period, then deleted. Teleport will not delete recordings automatically.
With a time-based retention policy in place, it's safe to give Teleport the "Storage Blob Data Contributor" role scoped to the containers, instead of having to define a custom role for it.
Authentication
Teleport will make use of the Microsoft Entra ID credentials specified by environment variables, Microsoft Entra Workload ID credentials, or managed identity credentials.
SQLite
The Auth Service uses the SQLite backend when no type is specified in the storage section in the Teleport configuration file, or when type is set to sqlite or dir. The SQLite backend is not designed for high throughput, and it's not capable of serving the needs of Teleport's High Availability configurations.
If you are planning to use SQLite as your backend, scale your cluster slowly and
monitor the number of warning messages in the Auth Service's logs that say
SLOW TRANSACTION, as that's a sign that the cluster has outgrown the capabilities
of the SQLite backend.
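One simple way to track that signal is to count matching lines in the Auth Service's log output. The sketch below is a minimal example: `count_slow_transactions` is a hypothetical helper, and the sample log lines are invented for illustration. Only the SLOW TRANSACTION marker comes from this guide; the surrounding log format is an assumption.

```python
def count_slow_transactions(log_lines):
    """Count log lines that contain the SLOW TRANSACTION warning marker."""
    return sum(1 for line in log_lines if "SLOW TRANSACTION" in line)


# Invented sample lines; real Auth Service log lines will look different.
sample_log = [
    "2025-01-01T00:00:01Z WARN SLOW TRANSACTION duration:2.1s",
    "2025-01-01T00:00:02Z INFO heartbeat ok",
    "2025-01-01T00:00:09Z WARN SLOW TRANSACTION duration:3.4s",
]
slow = count_slow_transactions(sample_log)
print(slow)  # 2
```

The same function accepts any iterable of lines, such as an open log file, so it can be run periodically and the count compared over time as the cluster grows.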
As a stopgap measure until it's possible to migrate the cluster to use an HA-capable backend, you can configure the SQLite backend to reduce the amount of disk synchronization, in exchange for less resilience against system crashes or power loss. For an explanation of what the options mean, see the official SQLite docs. No matter the configuration, we recommend you take regular backups of your cluster state.
To reduce disk synchronization:
teleport:
  storage:
    type: sqlite
    sync: NORMAL
To disable disk synchronization altogether:
teleport:
  storage:
    type: sqlite
    sync: "OFF"
When running on a filesystem that supports file locks (i.e. a local filesystem, not a networked one) it's possible to also configure the SQLite database to use Write-Ahead Logging (see the official docs on WAL mode) for significantly improved performance without sacrificing reliability:
teleport:
  storage:
    type: sqlite
    sync: NORMAL
    journal: WAL
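To see what these options mean at the SQLite level, the following Python sketch applies the equivalent settings to a throwaway database using the standard sqlite3 module. It assumes that Teleport's sync and journal options correspond to SQLite's `synchronous` and `journal_mode` pragmas, which this guide does not state explicitly. `open_tuned` is a hypothetical helper, and the database created here is unrelated to Teleport's own.

```python
import os
import sqlite3
import tempfile

ALLOWED_SYNC = {"OFF", "NORMAL", "FULL", "EXTRA"}
ALLOWED_JOURNAL = {"DELETE", "TRUNCATE", "PERSIST", "MEMORY", "WAL", "OFF"}


def open_tuned(path, sync="NORMAL", journal="WAL"):
    """Open a SQLite database and apply the given pragma values."""
    if sync not in ALLOWED_SYNC or journal not in ALLOWED_JOURNAL:
        raise ValueError("unsupported sync or journal value")
    conn = sqlite3.connect(path)
    # Values are checked against fixed sets above, so interpolation is safe.
    conn.execute(f"PRAGMA journal_mode = {journal}")
    conn.execute(f"PRAGMA synchronous = {sync}")
    return conn


with tempfile.TemporaryDirectory() as tmp:
    conn = open_tuned(os.path.join(tmp, "demo.db"))
    journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
    conn.close()

print(journal_mode)  # wal
print(synchronous)   # 1, SQLite's numeric code for NORMAL
```

The example uses a file on a local temporary directory because WAL mode depends on file locking, in line with the note above about local filesystems.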
The SQLite backend and other required data will be written to the Teleport data directory.
By default, Teleport's data directory is /var/lib/teleport. To change the location, set the data_dir value within the Teleport configuration file.
teleport:
  data_dir: /var/lib/teleport_data
CockroachDB
Use of the CockroachDB storage backend requires Teleport Enterprise.
Teleport can use CockroachDB as a storage backend to achieve high availability and survive regional failures. You must take steps to protect access to CockroachDB in this configuration because that is where Teleport secrets like keys and user records will be stored.
At a minimum, you must configure CockroachDB to allow Teleport to create tables. Teleport will create the database if given permission to do so, but this is not required if the database already exists.
CREATE DATABASE database_name;
CREATE USER database_user;
GRANT CREATE ON DATABASE database_name TO database_user;
You must also enable change feeds in CockroachDB's cluster settings. Teleport will configure this setting itself if granted SYSTEM MODIFYCLUSTERSETTING.
SET CLUSTER SETTING kv.rangefeed.enabled = true;
There are several ways to deploy and configure CockroachDB, the details of which are not in scope for this guide. To learn about deploying CockroachDB, see CockroachDB's deployment options. To learn about how to configure multi-region survival goals, see multi-region survival goals.
To configure Teleport to use CockroachDB as a storage backend:
- Configure all Teleport Auth Service instances to use the CockroachDB backend in the storage section of teleport.yaml as shown below.
- Deploy several Auth Service instances connected to the CockroachDB storage backend.
- Deploy several Proxy Service instances.
- Make sure that the Proxy Service instances and all Teleport Agent services that connect directly to the Auth Service have the auth_server configuration setting populated with the address of a load balancer for Auth Service instances.
teleport:
  storage:
    type: cockroachdb
    # conn_string is a required parameter. It is a PostgreSQL connection string used
    # to connect to CockroachDB using the PostgreSQL wire protocol. Client
    # parameters may be specified using the URL. For a detailed list of available
    # parameters see https://www.cockroachlabs.com/docs/stable/connection-parameter
    #
    # If your certificates are not stored at the default ~/.postgresql
    # location, you will need to specify them with the sslcert, sslkey, and
    # sslrootcert parameters.
    #
    # pool_max_conns is an additional parameter that determines the maximum
    # number of connections in the connection pool used for the cluster state
    # database (the change feed uses an additional connection), defaulting to
    # a value that depends on the number of available CPUs.
    conn_string: postgresql://user_name@database-address/teleport_backend?sslmode=verify-full&pool_max_conns=20
    # change_feed_conn_string is an optional parameter. When unspecified Teleport
    # will default to using the same value specified for conn_string. It may be used
    # to configure Teleport to use a different user or connection parameters when
    # establishing a change feed connection.
    #
    # If your certificates are not stored at the default ~/.postgresql
    # location, you will need to specify them with the sslcert, sslkey, and
    # sslrootcert parameters.
    change_feed_conn_string: postgresql://user_name@database-address/teleport_backend?sslmode=verify-full
    # ttl_job_cron is an optional parameter which configures the interval at which CockroachDB will expire backend
    # items based on their time to live. By default this is configured to run every
    # 20 minutes. This is used by Teleport to clean up old resources that are no longer
    # connected to or needed by Teleport. Note that configuring this to run more
    # frequently may have performance implications for CockroachDB.
    ttl_job_cron: '*/20 * * * *'