Backup and Restore
Operators must have a plan in place to back up self-hosted Teleport clusters. While the Teleport Proxy Service and Teleport Agent services are stateless, you should ensure that you can restore their configuration files. The Teleport Auth Service manages state for the entire cluster, and it is critical that you can back up its data. This guide explains the components of a Teleport Auth Service deployment that must be backed up and lays out our recommended approach for performing backups.
Data to back up
You must back up data that the Auth Service manages across its three backends:
- Cluster state backend
- Audit event backend
- Session recording backend
See Storage Backends for more information on Teleport backends.
The following table summarizes the kinds of backend data the Auth Service maintains:
| What | Where |
|---|---|
| Local Users (not SSO) | Cluster state backend |
| Certificate Authorities | Cluster state backend |
| Dynamic resources (more information below) | Cluster state backend |
| teleport.yaml | File system |
| teleport.service | File system |
| license.pem | File system |
| TLS key/certificate | File system or third-party service (e.g., AWS Certificate Manager) |
| Audit log | Audit event backend |
| Session recordings | Session recording backend |
Backing up Teleport backends
Your plan for backing up the Teleport cluster state, audit event, and session recording backends depends on the solution you use for each backend. The following tables list the recommended backup strategy for each backend solution. For backends not listed here, consult the documentation for your backend.
Cluster state and audit event backends
For the most part, you can use the same solution for both cluster state and audit events, setting aside a separate table for each kind of backend. The exception is etcd, which can only function as a cluster state backend. (See a full explanation in Storage Backends.)
| Backend | Recommended backup strategy |
|---|---|
| Local Filesystem | Back up the data directory (/var/lib/teleport/ by default) |
| DynamoDB | Amazon DynamoDB documentation |
| etcd | etcd documentation |
| Firestore | Firestore documentation |
| Azure Database for PostgreSQL | Azure Database for PostgreSQL documentation |
| Cloud SQL for PostgreSQL | Cloud SQL for PostgreSQL documentation |
Session recording backends
| Backend | Recommended backup strategy |
|---|---|
| Local Filesystem | Back up the data directory (/var/lib/teleport/ by default) |
| S3 | Amazon S3 documentation |
| GCS | GCS has built-in redundancy, but you may also use cross-bucket replication |
| Azure Blob Storage | Azure Blob backup documentation |
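For the Local Filesystem rows in the tables above, a backup can be as simple as archiving the data directory. The following is a minimal shell sketch, not an official Teleport tool: backup_data_dir is a hypothetical helper, and /var/backups/teleport is an assumed destination. Copying a directory while the Auth Service is writing to it may capture a partially written file, so consider running this during a maintenance window.

```shell
# Sketch: archive a Teleport data directory into a timestamped tarball.
# backup_data_dir is a hypothetical helper, not a Teleport command.
backup_data_dir() {
  # /var/lib/teleport is the default data directory named in the tables above.
  local data_dir="${1:-/var/lib/teleport}"
  # Assumed destination; keep it outside the data directory.
  local dest_dir="${2:-/var/backups/teleport}"
  local stamp archive

  stamp="$(date -u +%Y%m%dT%H%M%SZ)"
  archive="${dest_dir}/teleport-data-${stamp}.tar.gz"

  mkdir -p "$dest_dir" || return 1
  # -C stores paths relative to the parent of the data directory.
  tar -czf "$archive" -C "$(dirname "$data_dir")" "$(basename "$data_dir")" || return 1

  # Print the archive path so callers can copy it off the host.
  echo "$archive"
}
```

Calling backup_data_dir with no arguments archives the default data directory and usually requires root. Copy the resulting archive to storage outside the Auth Service host so that it survives a host failure.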
Versioning dynamic resources with infrastructure as code
Teleport uses dynamic resources for roles, local users, authentication connectors, and other configurations, and stores dynamic resource data on the cluster state backend. Backing up the cluster state backend protects your cluster against the loss of dynamic resource data.
For more control over the version of a dynamic resource that you back up and restore, we recommend storing dynamic resource manifests in a code repository that uses a version control system. You can use a continuous deployment pipeline to apply configurations automatically, so your Teleport cluster always reflects the latest state of your dynamic resources.
If you need to revert to an earlier version of a resource (e.g., to correct a misconfiguration), you can restore the resource on your code repository without conducting a full restore of the cluster state backend.
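As a starting point for putting existing resources under version control, you can export them with tctl and commit the output. The sketch below assumes that tctl is installed and already authenticated against your cluster. The export_resource helper and the resources/ directory layout are hypothetical names chosen for illustration.

```shell
# Sketch: export one kind of dynamic resource into a repository checkout.
# export_resource is a hypothetical helper; the resources/ layout is an assumption.
export_resource() {
  local kind="$1"      # resource kind to export, for example: roles
  local repo_dir="$2"  # path to a checkout of your version-controlled repository

  mkdir -p "$repo_dir/resources" || return 1
  # tctl get prints the current manifests for the given kind.
  tctl get "$kind" --format=yaml > "$repo_dir/resources/${kind}.yaml" || return 1
  echo "$repo_dir/resources/${kind}.yaml"
}

# Example usage (commented out so the sketch has no side effects):
#   export_resource roles ~/teleport-config
#   git -C ~/teleport-config add resources/roles.yaml
#   git -C ~/teleport-config commit -m "Snapshot Teleport roles"
#
# To restore an earlier version, check out the older file and re-apply it:
#   tctl create -f ~/teleport-config/resources/roles.yaml
```

Exported manifests for some resource kinds can contain sensitive values, so review the output before pushing it to a shared repository.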
Teleport provides the following infrastructure as code tools for managing dynamic resources:
Cloning a backend
You can clone a Teleport Auth Service backend by instructing Teleport to retrieve all items from one backend and store them in another backend. You can use this operation to, for example, migrate data to a new backend or back up data from one region to another.
This operation uses the teleport backend clone command with a configuration file that includes information about the source and destination backends. The teleport process running the clone must have access to credentials for both backends. If you complete the instructions below on a virtual machine or Kubernetes pod that usually runs the Teleport Auth Service, you can expect the teleport process to have the required permissions.
- Write a configuration file for the clone. Create a file called clone.yaml that includes the following structure:

      # src is the configuration for the backend where data is cloned from.
      src:
        type: dynamodb
        region: us-east-1
        table: teleport_backend
      # dst is the configuration for the backend where data is cloned to.
      dst:
        type: sqlite
        path: /var/lib/teleport_data
      # parallel is the amount of backend data cloned in parallel.
      # If a clone operation is taking too long consider increasing this value.
      parallel: 100

  This example clones backend data in Amazon DynamoDB to a SQLite database.

- Update the src and dst sections of the clone configuration file to include information about the source and destination backends. Possible values of src and dst are the same as the teleport.storage section in the Teleport configuration file. See the Storage Backends reference for the configuration fields to assign for each backend.

- Run the following command on an Auth Service instance to execute the clone operation. The value of the -c flag is the configuration file you created earlier:

      sudo teleport backend clone -c clone.yaml
You can run the teleport backend clone command on an Auth Service instance without stopping your cluster. The command retrieves each item from the source backend and writes it to the destination backend. Any items created on the source backend after the initial retrieval will not be included in the clone.
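If you run clones on a schedule, you may want each run to write to a new destination rather than overwrite the previous copy. The sketch below generates a clone configuration with the same fields as the DynamoDB-to-SQLite example above. The write_clone_config helper and the dated destination path are hypothetical; adjust the src section to match your own backend.

```shell
# Sketch: write a clone configuration whose SQLite destination is chosen per run.
# write_clone_config is a hypothetical helper; the src values mirror the
# DynamoDB example in this guide and must be adapted to your backend.
write_clone_config() {
  local out_file="$1"  # where to write the clone configuration
  local dst_path="$2"  # directory for the destination SQLite backend

  cat > "$out_file" <<EOF
src:
  type: dynamodb
  region: us-east-1
  table: teleport_backend
dst:
  type: sqlite
  path: ${dst_path}
parallel: 100
EOF
}

# Example usage (commented out because it needs an Auth Service host):
#   dst="/var/lib/teleport_clone_$(date -u +%Y%m%d)"
#   write_clone_config clone.yaml "$dst"
#   sudo teleport backend clone -c clone.yaml
```

Writing each clone to a dated path keeps earlier copies available until you delete them, at the cost of extra disk space.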
Rollbacks
You can perform a rollback without backend corruption or data loss by using teleport backend clone to create a copy of an existing backend before you make a change. For example, before upgrading a cluster from Teleport v18 to v19, you can use the following clone configuration to copy the full Teleport backend state to a different key range of the same etcd cluster by setting the prefix field. See the etcd storage backend configuration reference for an explanation of this field:
    src:
      type: etcd
      prefix: teleport
      peers: [https://peer.example.com:2379]
    dst:
      type: etcd
      prefix: teleport-v18-backup
      peers: [https://peer.example.com:2379]
    parallel: 100
Teleport major releases may alter the backend state in a way that previous versions cannot reconcile. If the upgrade to v19 fails and requires a rollback, edit the Auth Service configuration to point it at the prefix that contains the v18 backup:
    teleport:
      storage:
        type: etcd
        prefix: teleport-v18-backup
        peers: [https://peer.example.com:2379]
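The configuration change above can also be scripted. The following sketch rewrites the value of the prefix field in a configuration file and keeps a copy of the original. The set_etcd_prefix helper is hypothetical, and it assumes that the only prefix keys in the file belong to the storage section and that the new value contains no | or & characters.

```shell
# Sketch: point a Teleport configuration file at a different etcd prefix.
# set_etcd_prefix is a hypothetical helper, not a Teleport command.
set_etcd_prefix() {
  local config="$1"      # path to the Auth Service configuration file
  local new_prefix="$2"  # for example: teleport-v18-backup
  local tmp_file

  tmp_file="$(mktemp)" || return 1
  # Rewrite the value on every line whose key is "prefix", keeping indentation.
  sed -E "s|^([[:space:]]*prefix:[[:space:]]*).*$|\1${new_prefix}|" "$config" > "$tmp_file" || return 1
  # Keep the original next to the edited file in case the edit must be undone.
  cp "$config" "${config}.bak" || return 1
  cat "$tmp_file" > "$config" && rm -f "$tmp_file"
}

# Example usage (commented out; the path is the usual default and may differ):
#   sudo sh -c '. ./rollback.sh && set_etcd_prefix /etc/teleport.yaml teleport-v18-backup'
```

After switching the prefix, restart the Auth Service so that it reads the new configuration, for example with systemctl restart teleport if your deployment uses the teleport.service unit.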