Skip to main content

Migrating to teleport-cluster v12

This guide covers the major changes of the teleport-cluster v12 chart and how to upgrade existing releases from version 11 to version 12.

Changes summary

The main changes in version 12 of the teleport-cluster chart are:

  • PodSecurityPolicy has been removed on Kubernetes 1.23 and 1.24
  • Teleport now deploys its Auth and Proxy Services as separate pods. Running Teleport with this new topology allows it to be more resilient to disruptions and scale better.
  • Proxies are now deployed as stateless workloads. The proxy session recording mode uploads recordings asynchronously. Non-uploaded records might be lost during rollouts (config changes or version upgrades for example). proxy-sync ensures consistency and does not have this limitation.
  • custom mode has been removed as it was broken by the topology change. It is replaced by a new configuration override mechanism allowing you to pass arbitrary Teleport configuration values.
  • The values standalone.* that were previously deprecated in favor of persistence have been removed.
  • The chart can now be scaled up in standalone mode. Proxy replication requires a TLS certificate; Auth replication requires using HA storage backends.
Version compatibility

The chart has always been versioned with Teleport but was often compatible with the previous Teleport major version. This is not the case for v12. Using the chart v12 requires at least Teleport v12.

How to upgrade

If you are running Kubernetes 1.23 and above, follow our Kubernetes 1.25 PSP removal guide.

Then, the upgrade path mainly depends on the chartMode used. If you used a "managed" mode like aws, gcp or standalone it should be relatively straightforward. If you relied on the custom chart mode, you will have to perform configuration changes.

Before upgrading, always:

warning

During the upgrade, Kubernetes will delete existing deployments and create new ones. This is not seamless and will cause some downtime until the new pods are up and all health checks are passing. This usually takes around 5 minutes.

If you use gcp, aws or standalone mode

The upgrade should not require configuration changes. Make sure you don't rely on standalone.* for storage configuration (if you do, switch to using persistence values instead).

Upgrading to v12 will increase the amount of pods deployed as it will deploy auth and proxies separately. The chart will try to deploy multiple proxy replicas when possible (proxies can be replicated if certs are provided through a secret or cert-manager). Make sure you have enough room in your Kubernetes cluster to run the additional Teleport pods:

  • aws and gcp will deploy twice the amount of pods
  • standalone will deploy 1 or 2 additional pods (depending if the proxy can be replicated)

The additional pods might take more time than before to deploy and become ready. If you are running helm with --wait or --atomic make sure to increase your timeouts to at least 10 minutes.

If you use custom mode

The custom mode worked by passing the Teleport configuration through a ConfigMap. Due to the version 12 topology change, existing custom configuration won't work as-is and will need to be split in two separate configurations: one for the proxies and one for the auths.

To avoid a surprise breaking upgrade, the teleport-cluster v12 chart will refuse to deploy in custom mode and point you to this migration guide.

Version 12 has introduced a new way to pass arbitrary configuration to Teleport without having to write a full configuration file. If you were using custom mode because of a missing chart feature (like etcd backend support for example) this might be a better fit for you than managing a fully-custom config.

If you deploy a Teleport cluster

You can now use the existing modes aws, gcp and standalone and pass your custom configuration overrides through the auth.teleportConfig and proxy.teleportConfig values. For most use-cases this is the recommended setup as you will automatically benefit from future configuration upgrades.

You must split the configuration in two configurations, one for each node type:

  • The proxy configuration must contain at least the proxy_service section and the teleport section without the storage part.
  • The auth configuration must contain at least the auth_service and teleport sections.

For example - a v11 custom configuration that looked like this:

teleport:
log:
output: stderr
severity: INFO
auth_service:
enabled: true
cluster_name: custom.example.com
tokens: # This is custom configuration
- "proxy,node:abcd123-insecure-do-not-use-this"
- "trusted_cluster:efgh456-insecure-do-not-use-this"
listen_addr: 0.0.0.0:3025
public_addr: custom.example.com:3025
session_recording: node-sync
proxy_service:
enabled: true
listen_addr: 0.0.0.0:3080
public_addr: custom.example.com:443
ssh_public_addr: ssh-custom.example.com:3023 # This is custom configuration

Can be converted into these values:

chartMode: standalone
clusterName: custom.example.com

sessionRecording: node-sync

auth:
teleportConfig:
auth_service:
tokens:
- "proxy,node:abcd123-insecure-do-not-use-this"
- "trusted_cluster:efgh456-insecure-do-not-use-this"

proxy:
teleportConfig:
proxy_service:
ssh_public_addr: ssh-custom.example.com:3023
warning

teleport.cluster_name and teleport.auth_service.authentication.webauthn.rp_id MUST NOT change.

If you deploy Teleport nodes

If you used the teleport-cluster chart in custom mode to deploy only services like app_service, db_service, kube_service, windows_service or discovery_service, you should use the teleport-kube-agent chart for this purpose.

The chart offers values to configure app_service, kube_service and db_service, but other services can be configured through the teleportConfig value.

To migrate to the teleport-kube-agent chart from teleport-cluster, use the following values:

proxyAddr: teleport.example.com
# pass the token through joinParams instead of `teleportConfig` so it lives
# in a Kubernetes Secret instead of a ConfigMap
joinParams:
method: token
tokenName: abcd123-insecure-do-not-use-this

# Roles can be empty if you pass all the configuration through `teleportConfig`
roles: ""

# Put all your previous `teleport.yaml` values except the `teleport` section below
teleportConfig:
# kubernetes_service:
# enabled: true
# [...]
# discovery_service:
# enabled: true
# [...]

Going further

The new topology allows you to replicate the proxies to increase availability. You might also want to tune settings like Kubernetes resources or affinities.

By default, each value applies to both proxy and auth pods, e.g.:

resources:
requests:
cpu: "1"
memory: "2GiB"
limits:
cpu: "1"
memory: "2GiB"

highAvailability:
requireAntiAffinity: true

But you can scope the value to a specific pod set by nesting it under the proxy or auth values. If both the value at the root and a set-specific value are set, the specific value takes precedence:

# By default, all pods use those resources
resources:
requests:
cpu: "1"
memory: "2GiB"
limits:
cpu: "1"
memory: "2GiB"

proxy:
# But the proxy pods have have different resource requests and no cpu limits
resources:
requests:
cpu: "0.5"
memory: "1GiB"
limits:
cpu: ~ # Generic and specific config are merged: if you want to unset a value, you must do it explicitly
memory: "1GiB"

auth:
# Only auth pods will require an anti-affinity
highAvailability:
requireAntiAffinity: true