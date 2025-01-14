Version: 15.x

Cert Authority Rotation

A running Teleport cluster version 15.4.30 or above. If you want to get started with Teleport, sign up for a free trial or set up a demo environment.

The tctl admin tool and tsh client tool. On Teleport Enterprise, you must use the Enterprise version of tctl, which you can download from your Teleport account workspace. Otherwise, visit Installation for instructions on downloading tctl and tsh for Teleport Community Edition.

To check that you can connect to your Teleport cluster, sign in with tsh login, then verify that you can run tctl commands using your current credentials. tctl is supported on macOS and Linux machines. For example:

tsh login --proxy=teleport.example.com --user=[email protected]

tctl status

If you can connect to the cluster and run the tctl status command, you can use your current credentials to run subsequent tctl commands from your workstation. If you host your own Teleport cluster, you can also run tctl commands on the computer that hosts the Teleport Auth Service for full permissions.

This section will show you how to rotate Teleport's certificate authority.

If you are joining Teleport processes to a cluster via the Teleport Auth Service using a join token, each Teleport process will need a CA pin to trust the Auth Service. The CA pin will change after each host CA rotation. Make sure you use the new CA pin when adding Teleport services after host CA rotation.
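Because the CA pin changes with every host CA rotation, it can help to record the pin before and after the rotation and compare the two. The sketch below assumes that tctl status prints a line whose first two fields are "CA pin", as in the sample output in this guide. The function name ca_pin and the sample pin values are hypothetical.

```shell
#!/usr/bin/env bash
# ca_pin: read `tctl status` output on stdin and print the CA pin value.
# Assumes a whitespace-separated line of the form "CA pin <value>".
ca_pin() {
  awk '$1 == "CA" && $2 == "pin" { print $3 }'
}

# Hypothetical sample output standing in for real `tctl status` calls.
before="$(printf 'Cluster acme.cluster\nVersion 15.4.30\nCA pin sha256:old-hash\n' | ca_pin)"
after="$(printf 'Cluster acme.cluster\nVersion 15.4.30\nCA pin sha256:new-hash\n' | ca_pin)"

if [ "$before" != "$after" ]; then
  echo "CA pin changed: use $after when joining new services"
fi
```

Against a live cluster, the input would come from piping tctl status into ca_pin once before the rotation starts and once after it completes.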

Exported CAs: Teleport signs Windows Desktop client certificates with the user certificate authority. If the user CA is rotated, the new CA certificate must be exported and configured in group policy. Teleport signs self-hosted database host certificates with the db certificate authority and signs database client certificates with the db_client CA. If either of these CAs is rotated, then self-hosted databases must be reconfigured. Refer to the Database CA Rotation Guide.

The rotation consists of several phases:

standby: All operations have completed or haven't started yet.

init: All components are notified of the rotation. A new certificate authority is issued, but not used. It is necessary for remote trusted clusters to fetch the new certificate authority, otherwise new clients will reject it.

update_clients: Internal clients certs are updated and reloaded. Servers will use and respond with old credentials because clients have no idea about new certificates at first.

update_servers: Servers reload and start serving TLS and SSH certificates signed by the new certificate authority, but will still accept certificates issued by the old certificate authority.

rollback: The rotation was aborted and is rolling back to the old certificate authority.

There are two kinds of certificate rotations:

Manual: It is the cluster administrator's responsibility to transition between each phase of the rotation while monitoring the state of the cluster. Manual rotations provide the greatest level of control, and are performed by providing the desired phase using the --phase flag with the tctl auth rotate command.

Semi-automatic: Teleport automatically transitions between phases of the rotation after some amount of time (known as a grace period) elapses.

For both types of rotation, the cluster goes through the phases in the following order:

standby -> init -> update_clients -> update_servers -> standby

Administrators can abort the rotation and revert all changes any time before the rotation is completed by entering the rollback phase.

tctl auth rotate --manual --type=type --phase=rollback

For example, if an admin has detected that some nodes failed to upgrade during update_servers, they can roll back to the previous certificate authority, and the phase transitions look like this:

update_servers -> rollback -> standby
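The phase order and the rollback path described above can be captured in a small lookup. The following is a minimal sketch of a hypothetical helper, valid_transition, which is not part of tctl. It encodes the transitions as this guide describes them, not necessarily the exact checks the Auth Service performs.

```shell
#!/usr/bin/env bash
# valid_transition FROM TO: succeed (exit 0) if moving from phase FROM to
# phase TO follows the order described in this guide, fail (exit 1) otherwise.
valid_transition() {
  case "$1:$2" in
    standby:init|init:update_clients|update_clients:update_servers|update_servers:standby)
      return 0 ;;  # normal forward progression
    init:rollback|update_clients:rollback|update_servers:rollback)
      return 0 ;;  # abort a rotation that is still in progress
    rollback:standby)
      return 0 ;;  # finish the rollback
    *)
      return 1 ;;  # anything else skips or reverses a phase
  esac
}

valid_transition update_servers rollback && echo "update_servers -> rollback: ok"
valid_transition init update_servers || echo "init -> update_servers: not allowed"
```

A wrapper script could call such a check before invoking tctl auth rotate, so that a mistyped phase is caught before it reaches the cluster.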

info Try rotation and rollback in manual mode first to understand the edge cases and gotchas before moving to the semi-automatic version.

To specify which certificate authority to rotate, you must provide a value via the --type flag. If no value is provided, tctl will display an error and exit.

In manual mode, we manually transition between phases while monitoring the state of the cluster.

Start the rotation

Initiate the manual rotation of host certificate authorities:

tctl auth rotate --manual --type=type --phase=init

Updated rotation phase to "init". To check status use 'tctl status'

Use tctl to confirm that there is an active rotation in progress:

tctl status

Check the status of connected nodes:

tctl get nodes --format=json | jq '.[] | {hostname: .spec.hostname, rotation: .spec.rotation.state, phase: .spec.rotation.phase}'

{
  "hostname": "terminal",
  "rotation": "in_progress",
  "phase": "init"
}

In this example, the node named terminal has updated its status to phase init. This means it has downloaded a new CA public key and is ready for state transitions.

Rotation warning: If some nodes are offline during the rotation or have failed to update their status, you will lose connectivity to them after the transition update_servers -> standby. Make sure that all nodes are up to date with the transitions before proceeding.
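With more than a handful of nodes, reading the JSON by eye becomes error-prone. The sketch below lists the nodes that have not yet reached an expected phase. The helper name stragglers and the node db-01 are hypothetical, and the input format (one "hostname TAB phase" pair per line) is an assumption chosen so that the function can be fed from jq.

```shell
#!/usr/bin/env bash
# stragglers EXPECTED_PHASE: read "hostname<TAB>phase" lines on stdin and
# print the hostname of every node whose phase differs from EXPECTED_PHASE.
# The exit status is 0 only when no such node was found.
stragglers() {
  awk -F '\t' -v want="$1" '
    NF && $2 != want { print $1; behind = 1 }
    END { exit behind + 0 }
  '
}

# Hypothetical node list: "terminal" is the node from the example above,
# "db-01" is made up to show a node that has not caught up.
printf 'terminal\tinit\ndb-01\tstandby\n' | stragglers init \
  || echo "some nodes have not reached init yet"
```

Against a live cluster, the input could be produced with tctl get nodes --format=json | jq -r '.[] | [.spec.hostname, .spec.rotation.phase] | @tsv', using the same fields as the jq filter shown earlier.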

Update clients

Execute the transition from init to update_clients:

tctl auth rotate --manual --type=type --phase=update_clients

tctl status

note Clients will temporarily lose connectivity during Proxy and Auth Server restarts.

Verify that nodes have caught up and now see the current cluster state:

tctl get nodes --format=json | jq '.[] | {hostname: .spec.hostname, rotation: .spec.rotation.state, phase: .spec.rotation.phase}'

{
  "hostname": "terminal",
  "rotation": "in_progress",
  "phase": "update_clients"
}

Update servers

Now that all nodes have caught up, execute the transition from update_clients to update_servers:

tctl auth rotate --manual --type=type --phase=update_servers

tctl status

note Usually if things go wrong, they go wrong at this transition. If you have lost connectivity to nodes, roll back to the old certificate authority.

Verify that nodes have caught up:

tctl get nodes --format=json | jq '.[] | {hostname: .spec.hostname, rotation: .spec.rotation.state, phase: .spec.rotation.phase}'

{
  "hostname": "terminal",
  "rotation": "in_progress",
  "phase": "update_servers"
}

Finish the rotation

Before wrapping up, verify that you have not lost any nodes and can connect to them, for example:

tsh ssh hello@terminal

warning This is the last stage where you have the opportunity to roll back. If you have lost connectivity to nodes, roll back to the old certificate authority.

tctl auth rotate --manual --type=type --phase=standby

Verify that the rotation has completed with tctl:

tctl status

Cluster acme.cluster
Version 15.4.30
Host CA rotated Sep 20 2023 02:11:25 UTC
User CA rotated Sep 20 2023 01:42:54 UTC
Jwt CA rotated Sep 20 2023 01:42:54 UTC
CA pin sha256:hash

Nodes should catch up and be on standby:

tctl get nodes --format=json | jq '.[] | {hostname: .spec.hostname, rotation: .spec.rotation.state, phase: .spec.rotation.phase}'

{
  "hostname": "terminal",
  "rotation": "standby",
  "phase": "standby"
}
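As a recap of the manual procedure, the sketch below prints the four tctl commands for one full rotation of a given CA type. The helper name rotation_commands is hypothetical. It deliberately prints the commands instead of running them, on the assumption that each step should be followed by the tctl status and tctl get nodes checks shown above before the next command is issued.

```shell
#!/usr/bin/env bash
# rotation_commands CA_TYPE: print, without running them, the tctl commands
# for one full manual rotation of the given CA type, in phase order.
rotation_commands() {
  local ca_type="$1" phase
  for phase in init update_clients update_servers standby; do
    echo "tctl auth rotate --manual --type=${ca_type} --phase=${phase}"
  done
}

# Example: list the steps for rotating the host CA.
rotation_commands host
```

Keeping the output as a checklist rather than an automated loop preserves the point of manual mode, which is that an administrator confirms the state of the cluster between phases.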

warning Semi-automatic rotation executes the same steps as the manual rotation, but with a grace period between them. It currently does not track the states of the nodes and you can lose connectivity if things go wrong.

You can trigger semi-automatic rotation by omitting the --manual and --phase flags.

tctl auth rotate --type=host

This will trigger a rotation process for hosts with a default grace period of 48 hours. During the grace period, certificates issued by both the old and the new certificate authority work.

You can customize the grace period and CA type with additional flags:

tctl auth rotate --type=user --grace-period=200h

tctl auth rotate --type=host --grace-period=8h

The rotation takes time, especially for hosts, because each node in a cluster needs to be notified that a rotation is taking place and request a new certificate for itself before the grace period ends.

During semi-automatic rotations, Teleport will attempt to divide the grace period so that it spends an equal amount of time in each phase before transitioning to the next phase. This means that using a shorter grace period will result in faster state transitions.
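To get a feel for how the grace period translates into transition times, the sketch below computes the hour offsets at which each transition would occur. It assumes the grace period is split into three equal intervals, one each for init, update_clients, and update_servers. That split is an assumption made for illustration, and the exact schedule Teleport computes may differ. The helper name phase_schedule is hypothetical.

```shell
#!/usr/bin/env bash
# phase_schedule HOURS: print the hour offset, counted from the start of the
# rotation, at which each transition would occur if the grace period were
# split into three equal intervals (an assumption, not confirmed behavior).
# Uses integer arithmetic, so offsets are rounded down to whole hours.
phase_schedule() {
  local grace="$1"
  echo "init -> update_clients: +$(( grace / 3 ))h"
  echo "update_clients -> update_servers: +$(( grace * 2 / 3 ))h"
  echo "update_servers -> standby: +${grace}h"
}

# Example: the default 48-hour grace period.
phase_schedule 48
```

Under this assumption, an 8-hour grace period would leave each phase well under three hours, which illustrates why a short grace period calls for nodes that are online and responsive throughout.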

warning Be careful when choosing a grace period when rotating host certificates.

The grace period needs to be long enough for all nodes in a cluster to request a new certificate. If some nodes go offline during the rotation and come back only after the grace period has ended, they will be forced to leave the cluster, i.e. users will no longer be allowed to SSH into them.

Check the cluster status:

tctl status

Cluster acme.cluster
Version 15.4.30
Host CA initialized (mode: manual, started: Sep 20 2023 01:44:36 UTC, ending: Sep 21 2023 07:44:36 UTC)

Check the status of individual nodes:

tctl get nodes --format=json | jq '.[] | {hostname: .spec.hostname, rotation: .spec.rotation.state, phase: .spec.rotation.phase}'

{
  "hostname": "terminal",
  "rotation": "in_progress",
  "phase": "init"
}

The node named terminal has updated its status to phase init. This means it has downloaded a new CA public key and is ready for state transitions.

Rollback must be performed before the rotation enters the standby state.

First, enter the rollback phase with a manual phase transition:

tctl auth rotate --phase=rollback --type=type --manual

Make sure that any nodes which have already updated have caught up and entered the rollback phase.

tctl get nodes --format=json | jq '.[] | {hostname: .spec.hostname, rotation: .spec.rotation.state, phase: .spec.rotation.phase}'

{
  "hostname": "terminal",
  "rotation": "in_progress",
  "phase": "rollback"
}

If connectivity to any of the nodes was lost during the rotation, this is likely because they were still using the old cert authority. Connectivity to these nodes should be restored when the rollback completes and the old certificate authority is made active.

How the Teleport Certificate Authority works.