High Availability for Teleport Agents
You can run multiple Teleport Agents that proxy the same infrastructure resources for high availability. If one Teleport Agent goes offline, Teleport users can still connect to the infrastructure resources that the agent was configured to proxy.
This guide explains how agent high availability works and how to configure it for your organization. Since you must maintain your own highly available agent deployments, this guide provides architectural context so you can understand how such a deployment functions.
There are four Teleport Agent services that support highly available deployments:
- Teleport Application Service
- Teleport Database Service
- Teleport Desktop Service
- Teleport Kubernetes Service
As a general rule, if two agents connected to the Teleport Proxy Service have the same configuration, those agents will proxy the same infrastructure resources, and the Teleport Proxy Service will load balance user traffic between them.
How it works
When the Teleport Proxy Service receives traffic to a Teleport-protected resource, it finds an available Teleport Agent that can proxy the resource and forwards the traffic to it.
Agent heartbeats
Each Teleport Agent sends periodic heartbeat messages to the Teleport Proxy Service for each infrastructure resource that the agent proxies. The Teleport Proxy Service uses heartbeats to assemble a continuously updated list of Teleport-protected resources in which each resource is associated with an agent.
Since an agent sends heartbeats for each registered resource, if multiple agents proxy the same resource, the Proxy Service maintains multiple records of resource-agent combinations.
For example, if an agent in us-east-1a and an agent in us-east-1b are both
proxying an application called myapp, the Proxy Service receives separate
heartbeats, and maintains separate records, for myapp in us-east-1a and
myapp in us-east-1b.
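Concretely, both agents could run an identical Application Service configuration; this is a sketch in which the app name and URI are illustrative:

```yaml
# teleport.yaml on BOTH agents (us-east-1a and us-east-1b).
# Because the configuration is identical, each agent sends its own
# heartbeat for "myapp", and the Proxy Service records one
# resource-agent pair per agent.
app_service:
  enabled: true
  apps:
  - name: "myapp"
    uri: "example.com"
```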
Proxy Service load balancing
When the Teleport Proxy Service receives traffic to a Teleport-protected resource, it determines whether the traffic belongs to an existing session. If it does, the Proxy Service forwards the traffic to the Teleport Agent associated with that session. Otherwise, the Proxy Service looks up a list of Teleport Agents configured to proxy the target resource (based on the heartbeats described in the previous section), creates a new session, and associates it with a random healthy agent in the list.
Because Teleport keeps track of sessions for target infrastructure, Teleport Agents are not stateless: if an agent that is proxying user connections goes offline, those users must establish new sessions.
User experience
tsh, the Web UI, and Teleport Connect list a single instance of each
Teleport-protected resource with a given name, meaning that end users do not
need to know how many Teleport Agents proxy a certain resource.
This means that, if a user wants to access an infrastructure resource proxied by multiple agents, they can continue to have the same experience when one of the agents becomes unavailable, with a possible delay while a new session is established.
Configuring proxied resources
Teleport Agent configuration files have two ways to instruct an agent to proxy infrastructure resources:
- Static resource configurations: A list of configured infrastructure resources for the agent to proxy.
- Dynamic resource watchers: A list of filters the agent uses to fetch dynamic resources from the Teleport Auth Service that represent applications, databases, Kubernetes clusters, and remote desktops. The agent proxies infrastructure resources that match its filters.
When an agent boots up, it starts sending heartbeat messages for each static resource configuration. It also starts its dynamic resource watchers, fetches dynamic resource configurations that match them, and starts sending heartbeats for each matching resource.
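A single agent's configuration file can combine both mechanisms. The following sketch, with illustrative names and labels, statically registers one application and also watches for any database resources labeled `region: us-east-1`:

```yaml
# Hypothetical teleport.yaml combining both mechanisms.
app_service:
  enabled: true
  # Static resource configuration: heartbeats for "myapp" start at boot.
  apps:
  - name: "myapp"
    uri: "example.com"
db_service:
  enabled: true
  # Dynamic resource watcher: the agent fetches matching db resources
  # from the Auth Service and sends heartbeats for each one.
  resources:
  - labels:
      "region": "us-east-1"
```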
Static resource configurations
In a static resource configuration, all information an agent needs to proxy an infrastructure resource is in the configuration file it reads when it first starts. If there are multiple agents proxying an infrastructure resource with the same name, the Proxy Service load balances user traffic between them:
- Applications
- Databases
- Kubernetes Clusters
- Desktops
You can configure multiple instances of the Teleport Application Service to proxy an application with the same name:
```yaml
# Same config for all agents in the pool.
app_service:
  enabled: true
  apps:
  - name: "myapp"
    uri: "example.com"
```
You can configure multiple instances of the Teleport Database Service to proxy a database with the same name:
```yaml
# Same config for all agents in the pool.
db_service:
  enabled: true
  databases:
  - name: "postgres"
    protocol: "postgres"
    uri: "postgres.example.com:5432"
```
You can configure multiple instances of the Teleport Kubernetes Service to proxy
a Kubernetes cluster with the same kube_cluster_name:
```yaml
# Same config for all agents in the pool.
kubernetes_service:
  enabled: true
  # Include the same kubeconfig for all agents.
  kubeconfig_file: /secrets/kubeconfig
  kube_cluster_name: mycluster
```
You can configure multiple instances of the Teleport Desktop Service to proxy a desktop with the same name:
```yaml
windows_desktop_service:
  enabled: true
  static_hosts:
  - name: example1
    ad: false
    addr: win1.dev.example.com
```
Note that when using the Teleport Desktop Service's built-in discovery capability, the service names discovered desktops automatically based on their hostnames.
Choosing an agent replica to connect to
With separate replicas, each instance of an agent service proxying a given infrastructure resource has a different name. This lets you explicitly choose which agent to connect through. Consider this example, in which two Teleport Database Service instances proxy the same database.
On the first Database Service instance, the database has the name
postgres-us-east-1a:
```yaml
# Database service instance #1.
db_service:
  enabled: true
  databases:
  # Note the name is different than instance #2 but the URI is the same.
  - name: "postgres-us-east-1a"
    protocol: "postgres"
    uri: "postgres.example.com:5432"
```
In the second instance, the configured database is the same but the agent configuration gives it a different name:
```yaml
# Database service instance #2.
db_service:
  enabled: true
  databases:
  # Note the name is different than instance #1 but the URI is the same.
  - name: "postgres-us-east-1b"
    protocol: "postgres"
    uri: "postgres.example.com:5432"
```
With this configuration, the database appears as two separate entries in
tsh db ls output, and you must pick one explicitly when connecting:
```
$ tsh db ls
Name
-------------------
postgres-us-east-1a
postgres-us-east-1b

$ tsh db connect postgres-us-east-1a
```
This approach is useful when you want control over which replica handles your connection.
Dynamic resource watchers
When an agent loads a dynamic resource watcher, it fetches dynamic resources from the Teleport Auth Service that represent infrastructure resources to proxy, filtering them to match a set of configured rules.
For example, the Teleport Application Service fetches app resources as long as
they include certain labels.
As with all dynamic resources, those that represent infrastructure include a
metadata.name field. If two infrastructure resources have the same name, the
Teleport Proxy Service load balances user traffic between them.
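For example, a dynamic app resource that the watchers below would match might look like the following; the name, URI, and label values are illustrative. Because an agent proxies every resource its watcher matches, creating this resource once makes each matching agent start sending heartbeats for it:

```yaml
# myapp.yaml -- register with: tctl create myapp.yaml
kind: app
version: v3
metadata:
  name: myapp
  labels:
    "region": "us-east-1"
spec:
  uri: "http://example.com"
```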
Select a resource type to view an example configuration of its dynamic resource watchers:
- Applications
- Databases
- Kubernetes Clusters
- Desktops
To configure the Teleport Application Service to watch for app resources, add
a labels field to its resources configuration:
```yaml
app_service:
  enabled: true
  resources:
  - labels:
      "region": "us-east-1"
```
To configure the Teleport Database Service to watch for db resources, add a
labels field to its resources configuration:
```yaml
db_service:
  enabled: true
  resources:
  - labels:
      "region": "us-east-1"
```
To configure the Teleport Kubernetes Service to watch for kube_cluster dynamic
resources, add a labels field to its resources configuration:
```yaml
kubernetes_service:
  enabled: true
  resources:
  - labels:
      "region": "us-east-1"
```
To configure the Windows Desktop Service to watch for dynamic_windows_desktop
resources, add a labels field to its resources configuration:
```yaml
windows_desktop_service:
  enabled: true
  resources:
  - labels:
      "region": "us-east-1"
```
Next steps
Dynamic resource watchers enable you to configure high availability for Teleport Agents without needing to know the names of any Teleport-protected resources in advance.
Teleport auto-discovery enables you to enroll infrastructure resources with Teleport as they come online. Since resource names are automatically populated, high availability is already enabled as long as there are at least two agents with an appropriate dynamic resource watcher. Get started with auto-discovery.