Caching Servers

For the caching servers in the front layer of your server architecture, you need to have a clear understanding of TCP connection scalability issues.

The first thing you may notice as the load on your system increases is that the cache server process runs out of file handles (unless its start script raises the relevant limit). This is because the operating system uses one file handle for each connection, and on many systems a single user process is allowed to open only 1024 handles by default. This problem can be temporarily fixed with the ulimit -n command (on Linux and FreeBSD). To fix it more permanently you need to edit /etc/sysctl.conf (on Linux and FreeBSD) or /etc/system (on Solaris); on Linux, per-user limits are usually set in /etc/security/limits.conf as well. You can set the maximum number of file handles to several hundred thousand, so there is no real limitation here.
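
As an illustration, a process can also inspect and raise its own file handle limit when it starts. The following is a minimal sketch in Python using the standard resource module (Unix only). The function name raise_file_handle_limit and the target of 65536 are assumptions made for this example, not values taken from any particular cache server.

```python
import resource


def raise_file_handle_limit(target=65536):
    """Try to raise this process's soft file handle limit towards `target`.

    An unprivileged process can only raise its soft limit as far as the
    hard limit, so the request is capped there. Returns the (soft, hard)
    limits in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means "no limit", so it needs special handling.
    wanted = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if soft == resource.RLIM_INFINITY or wanted <= soft:
        return soft, hard  # already at least as high as requested
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
    except (ValueError, OSError):
        pass  # the system refused the change; keep the current limits
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = raise_file_handle_limit()
print("file handle limits: soft=%s hard=%s" % (soft, hard))
```

Raising the hard limit itself still requires the system-level configuration described above.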

The operating system sets up TCP connections between a local port and an anonymous port on the requesting host:

cache01:2323 -> otherhost:1237

Port numbers are defined in TCP as unsigned 16-bit numbers, which gives a maximum of 65535 usable ports (port 0 is reserved). The local port number can, however, be re-used for connections to different hosts:

cache01:2323 -> otherhost:1237
cache01:2323 -> yetanotherhost:4545
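
The reason this re-use works is that TCP identifies a connection by the combination of both addresses and both ports, not by the local port alone. The following Python sketch models the two connections above as tuples; the Connection type is a hypothetical helper introduced only for this illustration.

```python
from collections import namedtuple

# A TCP connection is identified by all four of these values together.
Connection = namedtuple(
    "Connection", "local_host local_port remote_host remote_port"
)

connections = {
    Connection("cache01", 2323, "otherhost", 1237),
    Connection("cache01", 2323, "yetanotherhost", 4545),
}

# The same local port appears twice, but the tuples differ, so both
# connections can exist side by side.
print(len(connections))  # 2

# An identical tuple is the same connection, so the set does not grow.
connections.add(Connection("cache01", 2323, "otherhost", 1237))
print(len(connections))  # 2
```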

This means that the maximum theoretical number of connections a cache server can handle is:

(65535 - reserved-ports) * incoming-ip-addresses

where reserved-ports is the number of ports reserved for system services by the operating system (usually 1024).
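
To get a feel for the numbers, the formula can be evaluated directly. This is a small Python sketch; the function name max_connections and the sample address counts are assumptions chosen for the example.

```python
MAX_PORT = 65535  # highest port number a 16-bit field can hold


def max_connections(incoming_ip_addresses, reserved_ports=1024):
    """Theoretical connection ceiling according to the formula above."""
    if incoming_ip_addresses < 0:
        raise ValueError("incoming_ip_addresses must not be negative")
    if not 0 <= reserved_ports <= MAX_PORT:
        raise ValueError("reserved_ports must be between 0 and 65535")
    return (MAX_PORT - reserved_ports) * incoming_ip_addresses


print(max_connections(1))     # 64511
print(max_connections(1000))  # 64511000
```

With a single incoming IP address the ceiling is only about 64 thousand connections, which is why the number of distinct source addresses matters so much.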

For this to work well, the load balancer in front of the cache must be transparent: that is, it must supply the IP address of the request source and not its own IP address.

For example, if three users are visiting your web site:

user1:2213 -> load-balancer:80 -> cache01:80
user2:1212 -> load-balancer:80 -> cache01:80
user3:5333 -> load-balancer:80 -> cache01:80

then ideally, cache01 should see the IP addresses of the requesting clients (user1, user2 and user3) rather than the IP address of the load balancer. Your cache server will then be able to handle as many TCP connections as your load balancer can pass on (provided that your operating system kernel can allocate and recycle TCP connections fast enough).

If this is not possible, then an alternative (but less satisfactory) solution is to increase the maximum number of possible connections by adding additional interfaces (and corresponding IP addresses) to the load balancer and/or the cache server.
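
The effect of adding addresses can be estimated in the same way. The Python sketch below assumes that the cache sees only the load balancer's addresses, that the cache listens on a single port, and that 1024 ports are reserved; the function name ceiling_behind_balancer is hypothetical. Under those assumptions, every pairing of a load balancer address with a cache address has its own range of source ports.

```python
USABLE_PORTS = 65535 - 1024  # ports left after the usual reserved range


def ceiling_behind_balancer(balancer_addresses=1, cache_addresses=1):
    """Rough connection ceiling when the load balancer is not transparent.

    Assumes one listening port on the cache. Each (balancer address,
    cache address) pair can use the full range of source ports.
    """
    if balancer_addresses < 1 or cache_addresses < 1:
        raise ValueError("at least one address is needed on each side")
    return USABLE_PORTS * balancer_addresses * cache_addresses


print(ceiling_behind_balancer())      # 64511
print(ceiling_behind_balancer(4))     # 258044
print(ceiling_behind_balancer(4, 2))  # 516088
```

These figures are theoretical upper bounds only; memory and kernel limits will usually be reached first.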