TLS Routing Support for Teleport Behind an AWS Application Load Balancer

Nov 18, 2022 by Steve Huang


In Teleport 8, we introduced the TLS Routing feature that can multiplex all client connections on a single TLS/SSL port.

Recently we've added support for TLS Routing for Database Access when Teleport is deployed behind an AWS Application Load Balancer (ALB). In this article, we will take a deep look at the problem with Teleport behind an ALB and how we solved it.

What is an AWS Application Load Balancer?

By Wikipedia's definition, load balancing is "the process of distributing a set of tasks over a set of resources (computing units)".

AWS offers a few types of Elastic Load Balancers. An AWS Application Load Balancer, as the name suggests, functions at the application layer, the 7th layer of the Open Systems Interconnection (OSI) model. ALBs can make routing decisions based on the host, path, headers, etc. of an HTTP request, and can terminate HTTPS traffic with TLS certificates managed by AWS.

Application Load Balancer vs Network Load Balancer

A Network Load Balancer (NLB), on the other hand, functions at the 4th layer of the OSI model. Hence, for load balancing UDP or non-HTTP TCP traffic, NLB is the clear choice.

How about HTTP traffic, which is technically TCP? Well, I would say ALBs are usually preferred since they have visibility into the HTTP requests, so you can write HTTP-based routing rules or inspect certain aspects of the HTTP requests in the ALB access logs. That being said, there are situations where you would choose an NLB over an ALB, such as when latency is critical or when the load balancer needs static IPs.

Teleport behind a load balancer

A load balancer is usually used to achieve High Availability (HA), and Teleport is no exception. But Teleport is complicated. It runs more than just HTTP.

Naturally, one would just put Teleport behind an NLB and call it a day. Well, that works if only TCP listeners are used. If you need the NLB to terminate TLS with AWS-managed certificates, you are out of luck.

For starters, TLS listeners for NLBs do not support mutual TLS (mTLS). And when it comes to Application-Layer Protocol Negotiation (ALPN) support, only http/1.1 and h2 can be served. What is worse, the Server Name Indication (SNI) and ALPN protocols are stripped from the Client Hello when the load balancer performs its own TLS handshake with TLS targets, so that information is lost. Nor does the load balancer validate the certificates or server names of the TLS targets during the handshake.

We found that ALBs have the same problems, and some of these restrictions likely apply to other load balancer and reverse proxy implementations as well.

Then why are we even talking about ALBs? Well, in a typical Teleport setup today, an ALB can be deployed alongside an NLB (with TCP listeners), so at a minimum HTTPS traffic like Teleport's webapp and web APIs can be terminated by the ALB with AWS-managed certificates. Let's see if we can find a general solution to make the best use of it.

WebSockets

Ok, Teleport uses fancy TLS. Can we do that over HTTP? Or even better, is it possible to carry an arbitrary network protocol over HTTP?

Wait a minute. Isn't there one such protocol that ALBs natively support?

WebSockets!

How do they work?

A WebSocket connection starts with an HTTP connection upgrade request that features a couple of headers:

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13

And if the server is happy about it, it returns:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat

Once the upgraded connection is established, the client and server can exchange WebSocket data frames in full-duplex mode.
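As a side note on the handshake above, the server does not pick Sec-WebSocket-Accept freely. RFC 6455 derives it from the client's Sec-WebSocket-Key: append a fixed GUID, take the SHA-1 hash, and base64-encode the digest. Here is a minimal sketch of that derivation; the function name acceptKey is our own and not part of any library.

```go
package main

import (
	"crypto/sha1"
	"encoding/base64"
	"fmt"
)

// websocketGUID is the fixed GUID defined by RFC 6455 for the handshake.
const websocketGUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

// acceptKey derives the Sec-WebSocket-Accept value from the client's
// Sec-WebSocket-Key. The name is illustrative, not a standard library API.
func acceptKey(clientKey string) string {
	sum := sha1.Sum([]byte(clientKey + websocketGUID))
	return base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	// The key from the example request above.
	fmt.Println(acceptKey("x3JJHMbDL1EzLkh9GBhXDw=="))
}
```

Run against the key in the example request, this should reproduce the Sec-WebSocket-Accept value shown in the example response. The point for our purposes is that the upgrade handshake is plain HTTP with a few extra headers, which is exactly what an ALB knows how to forward.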

This is great! Let's see if we can generalize this process for a different protocol.

Customizing the connection upgrade

So here is the plan. Instead of an Upgrade for websocket, we will use a custom upgrade type between our server and client. In our TLS Routing case, we named it alpn but it's really an arbitrary name.

Once the upgrade handshake is done, the client will initiate a TLS handshake to start the TLS Routing flow, inside the upgraded connection. And yes, now this TLS handshake can have all the goodies like ALPN, SNI, client certificates, etc.

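[Sequence diagram: the custom connection upgrade through the ALB, followed by the TLS handshake inside the upgraded connection]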

It is important to note that what's encapsulated inside the upgraded connection is exactly what the client would normally send to the Teleport Proxy server, when there is no such ALB in the middle! Thus there is minimal change required for the original TLS Routing implementation. All that's needed is to bootstrap it with the custom connection upgrade.
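To make the bootstrap step concrete, here is a rough sketch of a custom connection upgrade in Go. It is not Teleport's actual implementation: the /upgrade path, the function names, and the echo behavior are all assumptions made for illustration, and the outer connection is plain HTTP to keep the example short (behind a real ALB it would be HTTPS). The server hijacks the connection after answering 101, and the client then treats the same connection as a raw byte stream.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"net/http"
	"net/http/httptest"
)

// upgradeType follows the article's "alpn" name; any agreed token would work.
const upgradeType = "alpn"

// upgradeHandler is a hypothetical server side. It accepts the custom upgrade,
// hijacks the connection, and echoes bytes back. A real proxy would pass the
// hijacked connection to its TLS Routing handler instead of echoing.
func upgradeHandler(w http.ResponseWriter, r *http.Request) {
	if r.Header.Get("Upgrade") != upgradeType {
		http.Error(w, "unsupported upgrade type", http.StatusBadRequest)
		return
	}
	hj, ok := w.(http.Hijacker)
	if !ok {
		http.Error(w, "hijacking not supported", http.StatusInternalServerError)
		return
	}
	conn, rw, err := hj.Hijack()
	if err != nil {
		return
	}
	defer conn.Close()
	fmt.Fprintf(rw, "HTTP/1.1 101 Switching Protocols\r\nUpgrade: %s\r\nConnection: Upgrade\r\n\r\n", upgradeType)
	rw.Flush()
	io.Copy(conn, rw.Reader) // read via rw.Reader so already-buffered bytes are not lost
}

// dialUpgrade sends the upgrade request over conn and, once the server answers
// 101, returns a reader for the tunneled bytes. The "/upgrade" path is made up.
func dialUpgrade(conn net.Conn, host string) (*bufio.Reader, error) {
	req, err := http.NewRequest(http.MethodGet, "http://"+host+"/upgrade", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Upgrade", upgradeType)
	req.Header.Set("Connection", "Upgrade")
	if err := req.Write(conn); err != nil {
		return nil, err
	}
	br := bufio.NewReader(conn)
	resp, err := http.ReadResponse(br, req)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusSwitchingProtocols {
		return nil, fmt.Errorf("upgrade refused: %s", resp.Status)
	}
	return br, nil
}

// demoRoundTrip starts a local test server, upgrades a connection to it, and
// sends msg through the tunnel, returning what the server echoes back.
func demoRoundTrip(msg string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(upgradeHandler))
	defer srv.Close()

	addr := srv.Listener.Addr().String()
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return "", err
	}
	defer conn.Close()

	br, err := dialUpgrade(conn, addr)
	if err != nil {
		return "", err
	}
	// From here on the connection carries raw bytes. The article's client
	// would start its TLS Routing handshake at this point.
	if _, err := conn.Write([]byte(msg)); err != nil {
		return "", err
	}
	buf := make([]byte, len(msg))
	if _, err := io.ReadFull(br, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	echoed, err := demoRoundTrip("hello through the tunnel")
	fmt.Println(echoed, err)
}
```

In the real flow, the client would wrap the upgraded connection with a TLS client (for example via tls.Client from crypto/tls) carrying the desired ALPN, SNI, and client certificate, rather than writing plain bytes as this sketch does.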

Detecting the ALB

Now we know how to get through the ALB, but we also need a way to detect if an ALB is present to avoid the overhead of the connection upgrade when not necessary.

We do this by testing a custom (non-HTTP) ALPN in a TLS handshake with the remote server the client is connecting to.

// IsALPNConnUpgradeRequired returns true if a tunnel is required through an HTTP
// connection upgrade for ALPN connections.
func IsALPNConnUpgradeRequired(addr string, insecure bool) bool {
	netDialer := &net.Dialer{
		Timeout: defaults.DefaultDialTimeout,
	}
	tlsConfig := &tls.Config{
		NextProtos:         []string{"custom-alpn"},
		InsecureSkipVerify: insecure,
	}
	testConn, err := tls.DialWithDialer(netDialer, "tcp", addr, tlsConfig)
	if err != nil {
		// If dialing TLS fails for any reason, we assume connection upgrade is
		// not required so it will fall back to the original connection method.
		return false
	}
	defer testConn.Close()

	// Upgrade required when ALPN is not supported on the remote side so
	// NegotiatedProtocol comes back as empty.
	return testConn.ConnectionState().NegotiatedProtocol == ""
}

We found that an ALB completes this TLS handshake with no negotiated protocol. That's when we know we need to proceed with a connection upgrade.

As an ALB deployment rarely changes, there is no need to repeat the test on every single connection. We currently test once per client session, which should more than suffice.
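One way to avoid repeating the test is to remember the answer per proxy address. The sketch below is only an illustration of that idea and not how Teleport stores the result; the upgradeCache type and its methods are hypothetical, and the probe function stands in for a call such as IsALPNConnUpgradeRequired.

```go
package main

import (
	"fmt"
	"sync"
)

// upgradeCache remembers the probe result per proxy address for the lifetime
// of the process. The type and its fields are hypothetical.
type upgradeCache struct {
	mu      sync.Mutex
	results map[string]bool
	probe   func(addr string) bool // e.g. a wrapper around the detection function
}

func newUpgradeCache(probe func(addr string) bool) *upgradeCache {
	return &upgradeCache{results: make(map[string]bool), probe: probe}
}

// required runs the probe at most once per address and reuses the answer.
func (c *upgradeCache) required(addr string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.results[addr]; ok {
		return v
	}
	v := c.probe(addr)
	c.results[addr] = v
	return v
}

func main() {
	calls := 0
	cache := newUpgradeCache(func(addr string) bool {
		calls++
		return true // pretend an ALB was detected
	})
	for i := 0; i < 3; i++ {
		cache.required("teleport.example.com:443")
	}
	fmt.Println("probe calls:", calls) // prints "probe calls: 1"
}
```

Holding the lock while probing keeps the sketch simple at the cost of serializing concurrent first lookups, which seems acceptable for a check that runs once per address.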

Summary

In this article, we have explored a method to use a custom connection upgrade to tunnel an existing network protocol over HTTP.

Starting with Teleport v10.2.4, the tsh proxy db client command performs a custom connection upgrade to enable TLS Routing for database protocols when an ALB is detected. We will also look into porting this solution to other Teleport features like Server Access, Kubernetes Access, Application Access, and more in the future.
