# AWS Multi-Region Proxy Deployment

This deployment architecture features two important design decisions:

- Amazon Route 53 latency-based routing is used to ensure that users and agents connect to the closest available proxy.
- Teleport's [Proxy Peering](https://goteleport.com/docs/ver/17.x/zero-trust-access/management/operations/proxy-peering.md) is used to reduce the total number of tunnel connections in the Teleport cluster.

This deployment architecture isn't recommended for use cases where your users or resources are clustered in a single region, or for Managed Service Providers needing to provide separate clusters to customers.

This architecture is best suited for globally distributed resources and end-users that prefer a single point of entry while also ensuring minimal latency when accessing connected resources.

This guide assumes that your infrastructure is **not** hosted in AWS GovCloud or in an air-gapped AWS environment (EKS Anywhere). If it is, [contact the Teleport team](https://goteleport.com/contact-us) for assistance planning your deployment.

## Key deployment components

- Deployed exclusively in the AWS ecosystem
- High-availability Auto Scaling group of Auth Service instances that must remain in a single region
- High-availability Auto Scaling group of Proxy Service instances deployed across multiple regions
- [Amazon Route 53 latency-based routing](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy-latency.html)
- [Teleport TLS Routing](https://goteleport.com/docs/ver/17.x/reference/architecture/tls-routing.md) to reduce the number of ports needed to use Teleport
- [Teleport Proxy Peering](https://goteleport.com/docs/ver/17.x/zero-trust-access/management/operations/proxy-peering.md) for reducing the number of resource connections
- [AWS Network Load Balancing](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/introduction.html)
- [Amazon DynamoDB](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Introduction.html) for cluster state storage
- [AWS S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html) for session recording storage

## Advantages of this deployment architecture

- Eliminates the complexity and cost of maintaining multiple Teleport clusters across multiple regions.
- Uses the lowest-latency path to connect users to resources.
- Provides a highly resilient, redundant HA architecture for Teleport that can quickly scale with an organization's needs.
- All required Teleport components can be provisioned within the AWS ecosystem.
- Using load balancers for the Proxy Service and Auth Service allows for increased availability during Teleport cluster upgrades.

## Disadvantages of this deployment architecture

- When Teleport Auth Service instances are limited to a single region, there is a higher likelihood of decreased availability during an AWS regional outage.
- More technically complex to deploy than a single region Teleport cluster.

![Diagram showing this Teleport
architecture](/docs/assets/images/aws-gslb-proxy-peering-ha-deployment-0fa9e5baaab25cebaba194481f17af34.png)

## AWS Network Load Balancer (NLB)

AWS NLBs are required for this highly available deployment architecture. The NLB forwards traffic from users and services to an available Teleport Proxy Service instance. This must not terminate TLS, and must transparently forward the TCP traffic it receives. In other words, this must be a Layer 4 load balancer, not a Layer 7 (e.g., HTTP) load balancer.

---

NOTE

Cross-zone load balancing is required for the Auth Service and Proxy Service NLB configurations to route traffic across multiple zones. Doing this improves resiliency against localized AWS zone outages.

---

### Configure the Proxy Service NLBs

Configure the load balancer to forward traffic from the following ports on the load balancer to the corresponding port on an available Teleport instance.

| Port  | Description                                                                                                               |
| ----- | ------------------------------------------------------------------------------------------------------------------------- |
| `443` | ALPN port for TLS Routing, HTTPS connections to authenticate `tsh` users into the cluster, and to serve Teleport's Web UI |

### Configure the Auth Service NLB

Configure the load balancer to forward traffic from the following ports on the load balancer to the corresponding port on an available Teleport instance.

---

NOTE

Proxies must have network access to the Auth Service NLB. You can accomplish this using [VPC Peering](https://docs.aws.amazon.com/vpc/latest/peering/what-is-vpc-peering.html) or [Transit Gateways](https://docs.aws.amazon.com/vpc/latest/tgw/what-is-transit-gateway.html).

---

Internal NLB Auth Service ports

| Port   | Description                                                                |
| ------ | -------------------------------------------------------------------------- |
| `3025` | TLS port used by the Auth Service to serve its API to Proxies in a cluster |

## TLS credential provisioning

High-availability Teleport deployments require a system to fetch TLS credentials from a certificate authority like Let's Encrypt, AWS Certificate Manager, Digicert, or a trusted internal authority. The system must then provision Teleport Proxy Service instances with these credentials and renew them periodically.

For high-availability deployments that use Let's Encrypt to supply TLS credentials to Teleport instances running behind a load balancer, you need to use the [ACME DNS-01](https://letsencrypt.org/docs/challenge-types/#dns-01-challenge) challenge to demonstrate domain name ownership to Let's Encrypt. In this challenge, your TLS credential provisioning system creates a DNS TXT record with a value expected by Let's Encrypt.

## Latency-based routing with Route 53

[Latency-based routing](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy-latency.html) in a public hosted zone must be used to ensure traffic from Teleport resources are routed to the closest or lowest latency path Proxy NLB based on the region of the VPC the resource is connecting from.

To configure this behavior, create a CNAME record for each region where you have VPCs containing Teleport-connected resources. It is recommended to add a wildcard record for every region if you plan to register applications with Teleport.

The following CNAME record values need to be set:

- **Value:** The domain name of the NLB where `example-region-1` located Teleport resource traffic should be routed
- **Routing policy:** Latency
- **Region:** The AWS region from which traffic should be routed to the NLB listed in **Value**
- **Health Check ID:** It is recommended that you set this so that traffic is always routed to a healthy NLB

Example Hosted Zone using AWS Route53 Latency Routing:

### Root record for Teleport

| Record name              | Type  | Value                    |
| ------------------------ | ----- | ------------------------ |
| `*.teleport.example.com` | CNAME | AWS Route 53 nameservers |

### Teleport Proxy regional DNS records

| Record name              | Type  | Routing Policy | Region       | Value                            |
| ------------------------ | ----- | -------------- | ------------ | -------------------------------- |
| `teleport.example.com`   | CNAME | Latency        | us-west-1    | `elb.us-west-1.amazonaws.com`    |
| `*.teleport.example.com` | CNAME | Latency        | us-west-1    | `elb.us-west-1.amazonaws.com`    |
| `teleport.example.com`   | CNAME | Latency        | eu-central-1 | `elb.eu_central-1.amazonaws.com` |
| `*.teleport.example.com` | CNAME | Latency        | eu-central-1 | `elb.eu_central-1.amazonaws.com` |

---

REQUIRED PERMISSIONS

If you are using Let's Encrypt to provide TLS credentials to your Teleport instances, the TLS credential system we mentioned earlier needs permissions to manage Route53 DNS records in order to satisfy Let's Encrypt's DNS-01 challenge.

---

### Teleport resource agent configuration

To facilitate latency-based routing, resource agents must be configured to point `proxy_server` to the CNAME configured in Route53, **not** the specific proxy NLB address.

For example:

```
version: v3
teleport:
    nodename: ssh-node
    ...
    proxy_server: teleport.example.com:443
    ...
    ssh_service:
        enabled: true
    ...

```

Review the [configuration reference](https://goteleport.com/docs/ver/17.x/reference/config.md) page for additional settings.

## Configure Proxy Peering

In this deployment architecture, [Proxy Peering](https://goteleport.com/docs/ver/17.x/zero-trust-access/management/operations/proxy-peering.md) is used to restrict the number of connections made from resources to proxies in the Teleport Cluster.

### Auth Service Proxy Peering configuration

The Teleport Auth Service must be configured to use the `proxy_peering` tunnel strategy as shown in the example below:

```
auth_service:
 ...
 tunnel_strategy:
  type: proxy_peering
  agent_connection_count: 2

```

Reference the [Auth Service configuration](https://goteleport.com/docs/ver/17.x/reference/config.md) reference page for additional settings.

### Proxy Service Proxy Peering configuration

Proxies must advertise a peer address for proxy peers to establish connections to each other. The ports exposed on the Teleport Proxy Instances depends on whether you route Proxy Peering traffic over the public internet:

**Public Proxy Peering ports**

| Port   | Description                                                                                                               |
| ------ | ------------------------------------------------------------------------------------------------------------------------- |
| `443`  | ALPN port for TLS Routing, HTTPS connections to authenticate `tsh` users into the cluster, and to serve Teleport's Web UI |
| `3021` | Proxy Peering gRPC Stream                                                                                                 |

**VPC peering Proxy Peering ports**

| Port  | Description                                                                                                               |
| ----- | ------------------------------------------------------------------------------------------------------------------------- |
| `443` | ALPN port for TLS Routing, HTTPS connections to authenticate `tsh` users into the cluster, and to serve Teleport's Web UI |

Set `peer_public_addr` to the specific name of that proxy. This is the recommended method for lowest latency and most reliable connection.

```
version: v3
teleport:
...
proxy_service:
  ...
  peer_public_addr: teleport-proxy-eu-west-1-host1.example.com:3021
  ...

```

---

NOTE

`agent_connection_count` on the Auth service should be set to a value >=2 to decrease the likelihood of agents being unavailable.

---

Reference the [Proxy Service configuration](https://goteleport.com/docs/ver/17.x/reference/config.md) reference page for additional settings.
