
Prerequisites
- A running Teleport cluster. For details on how to set this up, see one of our Getting Started guides.
- The tctl admin tool and tsh client tool version >= 12.1.1.
  tctl version
  Teleport v12.1.1 go1.19
  tsh version
  Teleport v12.1.1 go1.19
  See Installation for details.
For Teleport Enterprise:
- A running Teleport cluster. For details on how to set this up, see our Enterprise Getting Started guide.
- The Enterprise tctl admin tool and tsh client tool version >= 12.1.1, which you can download by visiting the customer portal.
  tctl version
  Teleport Enterprise v12.1.1 go1.19
  tsh version
  Teleport v12.1.1 go1.19
To connect to Teleport, log in to your cluster using tsh, then use tctl remotely:
tsh login --proxy=teleport.example.com [email protected]
tctl status
Cluster teleport.example.com
Version 12.1.1
CA pin sha256:abdc1245efgh5678abdc1245efgh5678abdc1245efgh5678abdc1245efgh5678
You can run subsequent tctl commands in this guide on your local machine. For full privileges, you can also run tctl commands on your Auth Service host.
On Teleport Cloud, log in to your cluster using tsh, then use tctl remotely:
tsh login --proxy=myinstance.teleport.sh [email protected]
tctl status
Cluster myinstance.teleport.sh
Version 12.1.1
CA pin sha256:sha-hash-here
You must run subsequent tctl commands in this guide on your local machine.
Certificate Authority rotation
This section will show you how to implement certificate rotation in practice.
If you are using CA Pinning when adding new nodes, the CA pin will change after the rotation. Make sure you use the new CA pin when adding nodes after rotation.
Teleport signs Windows Desktop certificates with the user certificate authority. If the user CA is rotated, the new CA certificate will need to be exported and configured in group policy.
Prior to Teleport 12, --type would default to rotating all certificate authorities. It is best to rotate CAs one at a time for increased stability. Future versions of Teleport will require this, but you can opt in to the previous behavior by adding an explicit --type=all flag.
Rotation phases
The rotation consists of several phases:
- standby: All operations have completed or haven't started yet.
- init: All components are notified of the rotation. A new certificate authority is issued, but not used. Remote trusted clusters must fetch the new certificate authority; otherwise, new clients will reject it.
- update_clients: Internal client certificates are updated and reloaded. Servers continue to use and respond with old credentials, because clients do not yet know about the new certificates.
- update_servers: Servers reload and start serving TLS and SSH certificates signed by the new certificate authority, but will still accept certificates issued by the old certificate authority.
- rollback: The rotation was aborted and is rolling back to the old certificate authority.
Rotation types
There are two kinds of certificate rotations:
- Manual: it is the cluster administrator's responsibility to transition between each phase of the rotation while monitoring the state of the cluster. Manual rotations provide the greatest level of control, and are performed by providing the desired phase using the --phase flag with the tctl auth rotate command.
- Semi-automatic: Teleport automatically transitions between phases of the rotation after some amount of time (known as a grace period) elapses.
For both types of rotations, the cluster goes through the phases in the following order:
standby -> init -> update_clients -> update_servers -> standby
Administrators can abort the rotation and revert all changes at any time before the rotation is completed by entering the rollback phase:
tctl auth rotate --phase=rollback --type=type --manual
For example, if an admin has detected that some nodes failed to upgrade during update_servers, they can roll back to the previous certificate authority, and the phase transitions look like this: update_servers -> rollback -> standby.
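The phase order and the rollback path described above can be sketched as a small lookup. This is only an illustration of the transitions listed in this guide, not Teleport's implementation; the function names next_phase and can_roll_back are hypothetical helpers introduced here.

```shell
# Illustrative sketch of the documented phase order; not Teleport's own code.
# The function names are hypothetical.

# next_phase CURRENT: print the phase that follows CURRENT.
# A rollback always ends in standby.
next_phase() {
  case "$1" in
    standby)        echo init ;;
    init)           echo update_clients ;;
    update_clients) echo update_servers ;;
    update_servers) echo standby ;;
    rollback)       echo standby ;;
    *)              echo "unknown phase: $1" >&2; return 1 ;;
  esac
}

# can_roll_back CURRENT: succeed only while a rotation is in progress,
# that is, after leaving standby and before returning to it.
can_roll_back() {
  case "$1" in
    init|update_clients|update_servers) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, next_phase update_servers prints standby, and can_roll_back standby fails because the rotation has either completed or not started.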
Try rotation and rollback in manual mode first to understand the edge cases and gotchas before using the semi-automatic version.
To specify which certificate authority to rotate, you must provide a value via the --type flag. If no value is provided, tctl will display an error and exit. To replicate the functionality of versions prior to 12, where all certificate authorities were rotated by default, you can pass in --type=all. Keep in mind that this functionality is deprecated and will be removed in a future version.
Manual rotation
In manual mode, we manually transition between phases while monitoring the state of the cluster.
Start the rotation
Initiate the manual rotation of host certificate authorities:
tctl auth rotate --phase=init --type=type --manual
Updated rotation phase to "init". To check status use 'tctl status'
Use tctl to confirm that there is an active rotation in progress:
tctl status
Cluster acme.cluster
Version 12.1.1
Host CA initialized (mode: manual, started: Sep 20 01:44:36 UTC, ending: Sep 21 07:44:36 UTC)
User CA rotated Sep 20 01:42:54 UTC
Jwt CA rotated Sep 20 01:42:54 UTC
CA pin sha256:hash
Check the rotation status of connected nodes:
tctl get nodes --format=json | jq '.[] | {hostname: .spec.hostname, rotation: .spec.rotation.state, phase: .spec.rotation.phase}'
{
  "hostname": "terminal",
  "rotation": "in_progress",
  "phase": "init"
}
In this example, the node named terminal has updated its status to phase init. This means it has downloaded a new CA public key and is ready for state transitions.
If some nodes are offline during rotation or have failed to update their status, you will lose connectivity to them after the transition update_servers -> standby. Make sure that all nodes are up to date with the transitions before proceeding.
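With many nodes, it can help to list only those that have not yet reached the expected phase. The following is a minimal sketch that reuses the .spec.hostname and .spec.rotation.phase fields from the jq filter above. The function name nodes_not_in_phase is a hypothetical helper, and the sketch assumes jq is installed.

```shell
# nodes_not_in_phase PHASE (hypothetical helper): read the JSON produced by
# `tctl get nodes --format=json` on stdin and print the hostname of every
# node whose reported rotation phase differs from PHASE.
# Prints nothing when all nodes report PHASE.
nodes_not_in_phase() {
  jq -r --arg phase "$1" \
    '.[] | select(.spec.rotation.phase != $phase) | .spec.hostname'
}
```

A possible invocation is `tctl get nodes --format=json | nodes_not_in_phase init`. Empty output suggests that every listed node has caught up. Nodes that report no rotation phase at all are also printed, since their phase does not match.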
Update clients
Execute the transition from init to update_clients:
tctl auth rotate --phase=update_clients --type=type --manual
Updated rotation phase to "update_clients". To check status use 'tctl status'
tctl status
Cluster acme.cluster
Version 12.1.1
Host CA rotating clients (mode: manual, started: Sep 20 01:44:36 UTC, ending: Sep 21 07:44:36 UTC)
Clients will temporarily lose connectivity during Proxy and Auth Server restarts.
Verify that nodes have caught up and now see the current cluster state:
tctl get nodes --format=json | jq '.[] | {hostname: .spec.hostname, rotation: .spec.rotation.state, phase: .spec.rotation.phase}'
{
  "hostname": "terminal",
  "rotation": "in_progress",
  "phase": "update_clients"
}
Update servers
Now that all nodes have caught up, execute the transition from update_clients to update_servers:
tctl auth rotate --phase=update_servers --type=type --manual
Updated rotation phase to "update_servers". To check status use 'tctl status'
tctl status
Cluster acme.cluster
Version 12.1.1
Host CA rotating servers (mode: manual, started: Sep 20 01:44:36 UTC, ending: Sep 21 07:44:36 UTC)
Usually if things go wrong, they go wrong at this transition. If you have lost connectivity to nodes, roll back to the old certificate authority.
Verify that nodes have caught up:
tctl get nodes --format=json | jq '.[] | {hostname: .spec.hostname, rotation: .spec.rotation.state, phase: .spec.rotation.phase}'
{
  "hostname": "terminal",
  "rotation": "in_progress",
  "phase": "update_servers"
}
Finish the rotation
Before wrapping up, verify that you have not lost any nodes and can connect to them, for example:
tsh ssh [email protected]
This is the last stage where you have the opportunity to roll back. If you have lost connectivity to nodes, roll back to the old certificate authority. Otherwise, complete the rotation by transitioning to standby:
tctl auth rotate --phase=standby --type=type --manual
Verify that the rotation has completed with tctl:
tctl status
Cluster acme.cluster
Version 12.1.1
Host CA rotated Sep 20 02:11:25 UTC
User CA rotated Sep 20 01:42:54 UTC
Jwt CA rotated Sep 20 01:42:54 UTC
CA pin sha256:hash
Nodes should catch up and be on standby:
tctl get nodes --format=json | jq '.[] | {hostname: .spec.hostname, rotation: .spec.rotation.state, phase: .spec.rotation.phase}'
{
  "hostname": "terminal",
  "rotation": "standby",
  "phase": "standby"
}
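As a recap, the manual sequence walked through above can be printed as a checklist. This sketch only echoes the commands and does not run them, because each transition should be followed by the status and node checks shown in this guide. The function name manual_rotation_commands is a hypothetical helper, and host in the usage note stands in for whichever CA type you are rotating.

```shell
# manual_rotation_commands TYPE (hypothetical helper): print, without running
# them, the tctl commands for one manual rotation of the given CA type,
# in the phase order described in this guide.
manual_rotation_commands() {
  if [ -z "$1" ]; then
    echo "usage: manual_rotation_commands TYPE" >&2
    return 1
  fi
  for phase in init update_clients update_servers standby; do
    echo "tctl auth rotate --phase=$phase --type=$1 --manual"
  done
}
```

Running `manual_rotation_commands host` prints four lines, starting with `tctl auth rotate --phase=init --type=host --manual` and ending with the transition back to standby. Printing rather than executing keeps the operator in control of when each phase begins.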
Semi-Automatic rotation
Semi-automatic rotation executes the same steps as the manual rotation, but with a grace period between them. It currently does not track the states of the nodes and you can lose connectivity if things go wrong.
You can trigger semi-automatic rotation by omitting the --manual and --phase flags:
tctl auth rotate --type=host
This will trigger a rotation process for hosts with a default grace period of 48 hours. During the grace period, certificates issued by both the old and the new certificate authority are accepted.
You can customize the grace period and CA type with additional flags.
Rotate only user certificates with a grace period of 200 hours:
tctl auth rotate --type=user --grace-period=200h
Rotate only host certificates with a grace period of 8 hours:
tctl auth rotate --type=host --grace-period=8h
The rotation takes time, especially for hosts, because each node in a cluster needs to be notified that a rotation is taking place and request a new certificate for itself before the grace period ends.
During semi-automatic rotations, Teleport will attempt to divide the grace period so that it spends an equal amount of time in each phase before transitioning to the next phase. This means that using a shorter grace period will result in faster state transitions.
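To get a feel for how the grace period affects timing, the sketch below estimates when each transition would happen. It assumes the grace period is split into three equal parts, one each for init, update_clients, and update_servers. That split is an assumption made for illustration, and Teleport's actual schedule may differ. The function name phase_schedule is a hypothetical helper.

```shell
# phase_schedule GRACE_HOURS (hypothetical helper): print the offset, in
# minutes from the start of the rotation, at which each later phase would
# begin. ASSUMPTION: the grace period is divided into three equal parts.
phase_schedule() {
  case "$1" in
    ''|*[!0-9]*)
      echo "grace period must be a whole number of hours" >&2
      return 1 ;;
  esac
  total=$(( $1 * 60 ))
  echo "update_clients +$(( total / 3 ))m"
  echo "update_servers +$(( total * 2 / 3 ))m"
  echo "standby +${total}m"
}
```

Under this assumption, `phase_schedule 8` prints update_clients +160m, update_servers +320m, and standby +480m, so nodes would have under three hours in each phase to catch up.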
Be careful when choosing a grace period when rotating host certificates.
The grace period needs to be long enough for all nodes in a cluster to request a new certificate. If some nodes go offline during the rotation and come back only after the grace period has ended, they will be forced to leave the cluster, i.e. users will no longer be allowed to SSH into them.
Check the cluster status:
tctl status
Cluster acme.cluster
Version 12.1.1
Host CA initialized (mode: manual, started: Sep 20 01:44:36 UTC, ending: Sep 21 07:44:36 UTC)
Check the rotation status of individual nodes:
tctl get nodes --format=json | jq '.[] | {hostname: .spec.hostname, rotation: .spec.rotation.state, phase: .spec.rotation.phase}'
{
  "hostname": "terminal",
  "rotation": "in_progress",
  "phase": "init"
}
The node named terminal has updated its status to phase init. This means it has downloaded a new CA public key and is ready for state transitions.
Rollback
Rollback must be performed before the rotation enters the standby state.
First, enter the rollback phase with a manual phase transition:
tctl auth rotate --phase=rollback --type=type --manual
Updated rotation phase to "rollback". To check status use 'tctl status'
Make sure that any nodes which have already updated have caught up and entered the rollback phase:
tctl get nodes --format=json | jq '.[] | {hostname: .spec.hostname, rotation: .spec.rotation.state, phase: .spec.rotation.phase}'
{
  "hostname": "terminal",
  "rotation": "in_progress",
  "phase": "rollback"
}
If connectivity to any of the nodes was lost during the rotation, this is likely because they were still using the old certificate authority. Connectivity to these nodes should be restored when the rollback completes and the old certificate authority is made active.
Further reading
How the Teleport Certificate Authority works.